INTRODUCTION


Autism  spectrum  disorders  (ASD)  differ  from  other  developmental  disorders  in  that  children 
with  ASD  show  little  interest  in  other  people  (Kanner  1943).  This  lack  of  interest  is 
associated with other complex social deficits, including impaired empathy and shared attention, thus
further  disrupting  the  capacity  to  engage  in  normal  social  interactions  (Batson  et  al  1981; 
Goldman  1993).  Research  findings  suggest  that  social  problems  in  ASD  derive,  in  part, 
from  dysfunction  in  the  neural  circuits  that  motivate  the  other-regarding  behaviors  that 
shape  normal  social  interactions  (Bowles  2006;  Nichols  2001).  Other-regarding 
preferences  (ORPs)  describe  a  concern  for  the  welfare  or  the  benefit  of  others 
(Dufwenberg  et  al  2008;  Fehr  &  Fischbacher  2003).  ORPs  may  rely  on  empathy,  a  social- 
cognitive  capacity  severely  compromised  in  ASD  (Baron-Cohen  et  al  1985),  and 
pathological  deficits  of  empathy  in  ASD  may  result  from  a  failure  to  understand  others’ 
internal states (Baron-Cohen et al 1985). Accumulating evidence implicates dysfunction of
orbitofrontal cortex (OFC) and medial frontal cortex, including the anterior cingulate gyrus
(ACCg), in the pathophysiology of ASD (Bachevalier & Loveland 2006; Girgis et al
2007; Gilbert et al., 2009). Furthermore, oxytocin (OT), a neuromodulatory hormone
implicated  in  social  behavior  in  mammals,  has  also  been  implicated  in  the  etiology  of  ASD. 
Understanding  the  neuronal  properties  of  prefrontal  cortical  neurons  and  demonstrating 
OT-induced  changes  in  ORP  in  a  rhesus  macaque  model  will  significantly  advance  our 
understanding of social processing in both healthy and ASD brains. Our research aims to
develop a non-human primate model of ORPs specifically designed to probe these
mechanisms in healthy individuals, to identify the neuronal mechanisms involved in the
expression of ORPs, and to evaluate the efficacy of pharmacological OT therapies designed
to enhance social interaction in ASD.
BODY

Objectives  specified  in  the  approved  Project  Narratives 


Our  objectives  remain  unchanged  from  the  last  report. 

•  Objective 1: Develop an animal model of ORPs.

•  Objective  2:  Determine  how  OFC  neurons  mediate  ORPs. 

•  Objective  3:  Determine  the  effects  of  OFC  perturbations  on  ORPs. 

•  Objective  4:  Determine  whether  OT  can  enhance  positive  ORP. 


By  tasks  indicated  in  Statements  of  Work  (SOW) 

Task  1.  Characterize  neural  responses  in  the  orbitofrontal  cortex  (OFC) 

We  have  completed  this  task.  We  first  developed  an  other-regarding  preference 
(ORP)  task  in  pairs  of  rhesus  macaques.  We  found  that  actor  monkeys  prefer  cues  paired 
with  reward  to  a  recipient  monkey  over  cues  paired  with  reward  to  no  one,  displaying 
prosocial  preference.  By  contrast,  in  a  different  decision  context,  the  actors  prefer  cues 
paired  with  reward  to  self  over  cues  paired  with  reward  to  both  monkeys  simultaneously, 
displaying antisocial preference. Rates of attention to the recipient monkey (M2) strongly predicted the strength
and  valence  of  vicarious  reinforcement.  These  patterns  of  behavior,  which  were  absent  in 
non-social  control  trials,  are  consistent  with  vicarious  reinforcement  based  upon  sensitivity 
to  the  rewarding  experiences  of  another  individual.  Vicarious  reward  may  play  a  critical 
role  in  shaping  cooperation  and  competition,  as  well  as  motivating  observational  learning 
and group coordination in rhesus macaques, much as it does in humans. The detailed
methods, results, and figures for the ORP task can be found in Appendix 1 and
were published in Frontiers in Decision Neuroscience in 2011 (Chang et al., 2011).

We  next  recorded  the  activity  of  85  single  orbitofrontal  (OFC)  neurons,  101  single 
anterior  cingulate  sulcus  (ACCs)  neurons,  and  81  anterior  cingulate  gyrus  (ACCg)  neurons 
from  two  donor  monkeys  performing  the  ORP  task.  We  found  that  OFC  neurons  encode 
rewards  that  are  delivered  to  oneself,  whereas  ACCg  neurons  encode  reward  allocations 
to  the  other  monkey,  to  oneself  or  to  both.  ACCs  neurons,  on  the  other  hand,  signaled 
reward  allocations  to  the  other  monkey  or  to  no  one.  In  this  network  of  received  (OFC)  and 
foregone  (ACCs)  reward  signaling,  ACCg  emerged  as  an  important  nexus  for  the 
computation of shared experience and social reward. The detailed methods, results, and
figures for the neuronal results can be found in Appendix 2 and were published
in Nature Neuroscience in 2013 (Chang et al., 2013).


Task  2.  Characterize  behavior  after  muscimol  inactivation  of  prefrontal  cortex 

We  are  close  to  completing  the  task.  Our  neuronal  recording  results  strongly 
indicate  that  the  ACCg,  rather  than  OFC,  is  most  critical  for  social  representation  and 
social learning (Chang et al., 2013). Therefore, we tested the causal role of ACCg neurons,
in addition to neurons in a series of prefrontal areas, in social versus non-social processing.
To be more time efficient and to generalize the findings across different
behavioral paradigms, we trained new monkeys in parallel to perform observational social
learning tasks, which also depend on sensitivity to the rewarding experiences of another
individual.  We  tested  the  causal  contribution  of  ACCg  and  insula  to  social  learning  by 
investigating  whether  observer  monkeys  could  learn  the  value  of  a  novel  food  by  observing 
the  behavior  of  a  demonstrator  monkey  tasting  the  food.  We  tested  social  learning 
following  muscimol  inactivation  of  the  neuronal  populations  in  ACCg,  in  which  we  found 
social  encoding  of  donated  and  shared  rewards,  and  in  the  insula,  known  to  be  involved  in 
direct experience learning, especially disgust learning, compared with saline control
injections.  The  insula  is  known  to  be  hypoactive  in  ASD  (Uddin  and  Menon,  2009). 

We hypothesize that ACCg is an essential node of the neural network mediating the
acquisition of positive and negative food preferences through social learning (Figure 1),
consistent with the finding that ACCg neurons encode the rewarding experiences of another
individual (Chang et al., 2013). By contrast, we hypothesize that the insula is only involved
in non-social learning through direct experience. ACCg has been implicated in empathy
and social learning in humans. We previously showed that neurons in ACCg respond when
monkeys choose to give juice to another monkey while neurons in OFC respond when
monkeys choose to give juice to themselves (Chang et al., 2013). Damage to ACCg, but not
OFC, causes disruptions in social behavior in monkeys (Rudebeck et al., 2009).

In these experiments, one monkey, the
demonstrator, sits in a primate chair and eats food offered to him. Another monkey, the
observer, watches the demonstrator. The foods offered are colored “pearls” made from
gelatin  and  either  “good”  (citrus  or  berry;  sweet)  or  “bad”  (quinine;  bitter)  flavoring.  Each 
of  the  two  types  of  pearl  used  per  session  is  assigned  a  distinctive  color  (e.g.  green 
good  pearls  and  blue  bad  pearls),  which  changes  each  session.  Sessions  alternate 
between  demonstration  sessions  and  test  sessions.  In  demonstration  sessions,  the 
demonstrator  is  given  the  opportunity  to  eat  10  good  pearls  and  10  bad  pearls,  one  at  a 
time,  in  random  order.  In  pilot  studies,  after  sampling  a  good  pearl  the  demonstrator 
monkey  consumes  it  and  then  reaches  quickly  for  another  when  it  is  offered;  after 
sampling  a  bad  pearl,  the  demonstrator  monkey  displays  a  distasteful  facial  expression, 
spits  out  the  pearl,  and  rejects  bad  pearls  on  subsequent  trials  by  throwing  them  away. 
Observer  monkeys  subsequently  direct  their  gaze  to  palatable  foods  and  display  tongue 
protrusion  and  licking.  By  measuring  visual  orienting  and  tongue  protrusion  while 
presenting  the  good  and  bad  pearls,  we  assess  learning  of  food  preferences  without 
giving  subjects  direct  access  to  food.  These  data  are  then  compared  with  food 
preferences  subjects  develop  when  permitted  to  directly  sample  pearls.  After  observer 
monkeys  display  preferences  for  the  good  foods  learned  from  social  observation  and 
through  direct  experience,  we  invert  the  flavor/color  associations.  We  found  that 
observer  monkeys  typically  relearned  the  novel  food  values  within  the  first  session. 


Figure 1. Brain areas involved in social and
nonsocial  learning.  Based  on  Chang  et  al.,  2013, 
we  hypothesize  that  ACCg  is  causally  involved  in 
social  learning,  whereas  the  insula  is  causally 
involved  in  non-social  learning  through  direct 
experience. 
We have systematically explored the neural circuitry necessary for social learning by
inactivating these neuronal populations using the GABA agonist muscimol. Our preliminary
data (Figure 2) show that reversible pharmacological inactivation of ACCg severely impairs
social learning (baseline good food: 2.41 ± 0.36 vs bad: 1.35 ± 0.30 seconds spent looking
at food; after muscimol injection, good: 1.54 ± 0.35 vs bad: 2.44 ± 0.41; t-test, both
p < 0.05). However, non-social learning from direct experience was not blocked (baseline
good: 2.92 ± 0.50 vs bad: 1.76 ± 0.37 seconds spent looking at food; after the muscimol
injection, good: 1.91 ± 0.38 vs bad: 1.23 ± 0.23; t-test, both p < 0.05). These results
come from 12 injections in two monkeys. By contrast, we found the opposite pattern in the
insula: inactivating insula neurons impaired learning from direct experience (baseline
good: 2.11 ± 0.10 vs bad: 0.99 ± 0.05 seconds spent looking at food; after the muscimol
injection, good: 0.45 ± 0.04 vs bad: 1.02 ± 0.05; t-test, p < 0.05 for both). However,
social learning was intact (baseline good: 2.16 ± 0.10 vs bad: 1.26 ± 0.07 seconds
spent looking at food; after the muscimol injection, good: 1.01 ± 0.05 vs bad: 0.52 ±
0.04; t-test, p < 0.05 for both). These results come from 8 injections of muscimol in two
monkeys. Together, these results provide causal evidence that ACCg is specialized for
incorporating the experience of another individual into one’s own behavior, a form of
empathy, whereas the insula contributes to behavior modified by direct experience. We are
currently analyzing whether muscimol inactivation of ACCg and insula differentially
influences social attention during social and non-social learning. A manuscript based on
this work is currently in preparation (Gariepy et al., in prep).
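
The dissociation reported above can be made easier to see with a little arithmetic. The sketch below is only an illustration: it takes the mean looking times quoted in the text and subtracts bad-food from good-food looking time. This "preference index" and the names `LOOKING_TIMES`, `preference_index`, and `summarize` are assumptions introduced here, not measures or code used in the study, and the calculation ignores the reported variability and the t-tests.

```python
# Mean looking times in seconds as (good food, bad food), copied from the
# report text. Variability (the "±" values) is deliberately ignored here.
LOOKING_TIMES = {
    ("ACCg", "social"): {"baseline": (2.41, 1.35), "muscimol": (1.54, 2.44)},
    ("ACCg", "direct"): {"baseline": (2.92, 1.76), "muscimol": (1.91, 1.23)},
    ("insula", "direct"): {"baseline": (2.11, 0.99), "muscimol": (0.45, 1.02)},
    ("insula", "social"): {"baseline": (2.16, 1.26), "muscimol": (1.01, 0.52)},
}


def preference_index(good, bad):
    """Hypothetical index: extra seconds spent looking at good vs. bad food."""
    return good - bad


def summarize(data):
    """Map (region, learning type) to (baseline index, muscimol index, sign kept)."""
    summary = {}
    for key, sessions in data.items():
        baseline = round(preference_index(*sessions["baseline"]), 2)
        muscimol = round(preference_index(*sessions["muscimol"]), 2)
        # "Sign kept" only means the good-food preference stayed positive.
        summary[key] = (baseline, muscimol, muscimol > 0)
    return summary


if __name__ == "__main__":
    for (region, kind), row in summarize(LOOKING_TIMES).items():
        print(region, kind, row)
```

With these means, the index turns negative after muscimol only for social learning with ACCg inactivation and for direct learning with insula inactivation; in the other two conditions it shrinks but stays positive.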

We are currently testing the causal contributions of OFC neuronal populations so that we
can compare the results directly across ACCg, OFC, and the insula for their roles in social and
experience-based learning.


Figure 2. Muscimol injections into ACCg impaired
social learning (top curves) but not direct learning
(bottom curves) of food palatability.
[Figure legend: good-tasting vs. bad-tasting food. Conditions: Control (1 day),
Muscimol ACCg (2 days), Saline ACCg (2 days).]


Task  3.  Determine  behavioral  response  to  microstimulation 
We  have  not  yet  begun  this  task. 


Task  4.  Examine  the  effect  of  oxytocin  (OT)  on  task  performance 

We have completed this task. We showed, for the first time in a monkey or a
human, that OT inhaled through a pediatric nebulizer penetrates the central nervous system
and subsequently enhances the sensitivity of rhesus macaques to rewards occurring to
others as well as themselves in the ORP task. Roughly 2 hours after inhaling OT, donor
monkeys increased the frequency of prosocial choices associated with reward to another
monkey  (i.e.,  recipient  monkey)  when  the  alternative  was  to  reward  no  one.  OT  also 
increased  attention  to  the  recipient  monkey  as  well  as  the  time  it  took  to  make  such  a 
decision.  In  contrast,  within  the  first  2  hours  following  inhalation,  OT  enhanced  selfish 
choices  associated  with  delivery  of  reward  to  self  over  a  reward  to  the  other  monkey, 
without  influencing  attention  or  decision  reaction  times.  Thus,  inhaling  OT  causally 
promotes  prosocial  behavior  in  rhesus  monkeys  when  there  is  no  perceived  cost  to  self. 
These findings support the use of inhaled OT as a potential therapeutic for
enhancing social attention and prosocial behavior in ASD. This study also pioneered the
use of a pediatric nebulizer to deliver OT to the brain, a method that may be well-tolerated
by children. The detailed methods, results, and figures can be found in Appendix
3 and were published in Proceedings of the National Academy of Sciences in 2012
(Chang et al., 2012).
KEY RESEARCH ACCOMPLISHMENTS [see REPORTABLE OUTCOMES]


•  Demonstrated  that  rhesus  monkeys  are  sensitive  to  the  experiences  of  others  in  the 
reward  donation  task  (Presented  at  multiple  meetings,  Paper  Published:  Chang  et 
al.,  2011) 

•  Development  of  intranasal  oxytocin  (OT)  protocol  in  rhesus  monkeys  with  a 
confirmation  that  the  method  effectively  delivers  OT  to  the  central  nervous  system 
(Meeting  Presentations,  Paper  Published:  Chang  et  al.,  2012).  Now  this  method  is 
the  standard  in  the  field  for  delivering  OT  to  nonhuman  primates,  and  is  being 
investigated  for  use  in  children. 

•  Demonstrated that OT enhances prosocial behavior and social attention in rhesus
monkeys (Meeting Presentations, Paper
Published: Chang et al., 2012)

•  Discovered  that  the  orbitofrontal  cortex  (OFC)  does  not  play  a  critical  role  in 
processing  information  about  the  experiences  of  others,  a  basic  component  of 
empathy  (Meeting  Presentations,  Paper  Published:  Chang  et  al.,  2013) 

•  Discovered  that  the  anterior  cingulate  gyrus  (ACCg)  plays  a  critical  role  in 
processing  information  about  the  experience  of  others,  a  basic  component  of 
empathy  (Meeting  Presentations,  Paper  Published:  Chang  et  al.,  2013) 

•  Extension  of  the  current  experiments  to  other  prefrontal  brain  regions  (the  sulcus 
and  gyrus  of  the  anterior  cingulate  cortex)  to  better  understand  how  ORP-related 
signals  differ  across  different  parts  of  the  prefrontal  cortex  (Meeting  Presentations, 
Paper  Published:  Chang  et  al.,  2013) 

•  Extension  of  the  reward  donation  findings  to  social  learning  to  generalize  how 
ACCg,  OFC,  and  the  insula  contribute  to  social  behavior  (Meeting  Presentations, 
Paper  Published:  Chang  et  al.,  2013) 

•  Demonstration  that  ACCg  contributes  causally  to  social  learning  and  the  insula 
contributes  to  learning  from  direct  gustatory  experience  (Meeting  Presentations, 
Paper  in  prep:  Gariepy  et  al.) 
1.  Gariepy JF, Chang SWC, Du E, Erb J, Platt ML (in revision) Neuronal basis of
deceptive behaviour in rhesus macaques. Nat. Neurosci.

2.  Gariepy  JF,  Watson  KK,  Du  E,  Xie  DL,  Erb  J,  Amasino  D,  Platt  ML  (2013,  in 
revision)  Social  learning  in  humans  and  other  animals.  Frontiers  in  Decision 
Neuroscience. 

3.  Gariepy  JF,  Du  E,  Xie  DL,  Platt  ML  (in  prep)  Neural  basis  of  social  learning  in 
macaques. 

4.  Chang  SW  and  Platt  ML  (in  revision)  Oxytocin  and  social  cognition  in  rhesus 
macaques:  Implications  for  understanding  and  treating  human  psychopathology. 
Brain  Research:  Special  Issue  in  Oxytocin  and  Social  Behavior. 

5.  Chang  SW,  Gariepy  JF,  and  Platt  ML  (2013)  Neuronal  reference  frames  for  social 
decisions  in  primate  prefrontal  cortex.  Nat.  Neurosci.,  16,  243-250. 

6.  Gariepy  JF,  Chang  SW  and  Platt  ML  (2013)  Brain  games:  Toward  a  neuroecology 
of  social  behavior.  Invited  commentary  in  Beh.  Brain.  Sci.,  36,  424-5. 

7.  Chang  SW,  Barack  DL  and  Platt  ML  (2012)  Mechanistic  classification  of  neural 
circuit  dysfunctions:  Insights  from  neuroeconomics  research  in  animals.  Biol. 
Psychiatry,  72:101-106. 

8.  Chang  SW,  Barter  JW,  Ebitz  RB,  Watson  KK  and  Platt  ML  (2012)  Inhaled  oxytocin 
amplifies  both  vicarious  reinforcement  and  self  reinforcement  in  rhesus  macaques 
(Macaca mulatta). Proc Natl Acad Sci, 109, 959-964.

9.  Chang  SW,  Winecoff  AA,  and  Platt  ML  (2011)  Vicarious  reinforcement  in  rhesus 
macaques  (Macaca  mulatta).  Front.  Neurosci.,  5,  27. 


B.  Meeting  Abstracts 

1.  Gariepy JF. Prefrontal contributions to social learning and decision-making in
rhesus  macaques.  Neural  circuits  for  adaptive  control  of  behavior  (Paris,  France), 
2013  (Talk) 

2.  Gariepy  JF,  Chang  SW,  Du  E,  Platt  ML.  Neural  basis  of  deceptive  tactics  in  the 
primate  prefrontal  cortex.  The  Assembly  and  Function  of  Neural  Circuits  (Monte 
Verita,  Switzerland)  2013  (Poster) 

3.  Gariepy  JF,  Du  E,  Xie  D,  and  Platt  ML.  Neural  basis  of  social  learning  in  rhesus 
macaques.  Society  for  Neuroscience  (San  Diego,  CA),  2013  (Poster) 
4.  Xie D, Gariepy JF, Du E, and Platt ML. Inhaling oxytocin increases contagious
yawning  in  rhesus  macaques.  Society  for  Neuroscience  (San  Diego,  CA),  2013 
(Poster) 

5.  Chang  SW,  Gariepy  JF,  and  Platt  ML.  Differential  encoding  of  social  decision 
outcomes  by  neurons  in  primate  orbitofrontal  cortex,  dorsal  anterior  cingulate  cortex 
and  anterior  cingulate  gyrus.  Society  for  Neuroscience  (New  Orleans,  LA),  2012 
(Talk)  &  Contributed  Talk  at  Society  for  Social  Neuroscience,  2012 

6.  Gariepy JF, Chang SW, Du E, and Platt ML. Neural correlates of deceptive tactics
in  the  primate  prefrontal  cortex.  Society  for  Neuroscience  (New  Orleans,  LA),  2012 
(Talk)  (also  for  Society  for  Social  Neuroscience) 

7.  Chang  SW,  Gariepy  JF,  and  Platt  ML.  Neuronal  reference  frames  for  social 
decisions  in  primate  prefrontal  cortex.  Organization  for  Computational  Neuroscience 
(Atlanta,  GA),  2012  (Poster) 

8.  Platt,  ML.  Neuronal  basis  of  giving  and  receiving.  Organization  for  Computational 
Neuroscience  (Atlanta,  GA),  2012,  invited  talk. 

9.  Gariepy JF, Chang SW, Du E, and Platt ML. Neural correlates of deceptive tactics
in  the  primate  prefrontal  cortex.  Tenth  International  Congress  of  Neuroethology, 
2012  (Poster) 

10.  Chang  SW  and  Platt  ML.  Differential  coding  of  egocentric  and  allocentric  reward 
outcomes  during  social  interaction  in  primate  ACC  and  OFC.  Society  for 
Neuroscience  (Washington,  DC),  2011  (Talk) 

11.  Chang  SW,  Barter  JW,  Ebitz  RB,  Watson  KK  and  Platt  ML.  Oxytocin  promotes 
prosocial  decisions  in  rhesus  macaques.  Society  for  Neuroscience  (Washington, 
DC), 2011 (Poster)

12.  Chang  SW,  Barter  JW,  Ebitz  RB,  Watson  KK  and  Platt  ML.  Inhaled  oxytocin 
amplifies  both  vicarious  reinforcement  and  self  reinforcement  in  rhesus  macaques 
(Macaca  mulatta).  Workshop  on  the  Biology  of  Prosocial  Behavior  at  Emory 
University  (Atlanta,  GA),  2011  (Poster) 

13.  Chang  SW  and  Platt  ML.  Social  Context  Gates  Other-Regarding  Preferences  in 
Rhesus Macaques (Macaca mulatta). Society for Neuroscience (San Diego, CA),
2010  (Poster) 

14.  Chang  SW  and  Platt  ML.  Social  Context  Gates  Other-Regarding  Preferences  in 
Rhesus Macaques (Macaca mulatta). A Brain Research meeting: The Emerging
Neuroscience  of  Autism  Spectrum  Disorders  (San  Diego,  CA),  2010  (Poster) 


C.  Research  Support  (built  upon  this  award) 




1.  NIH/NIMH  2/21/12-11/30/16 

R01 MH095894-01 (Platt)

Neuronal  basis  of  vicarious  reinforcement  dysfunction  in  autism  spectrum  disorder 
The  goal  of  this  project  is  to  understand  the  role  of  prefrontal  cortex  in  mediating 
vicarious  reinforcement  during  reward  allocation  decisions 

2.  Duke Department of Neurobiology 6/01/11-5/31/12

Postdoctoral  Training  Award  in  Fundamental  &  Translational  Neuroscience 
NIH/NINDS T32 NS051156-07 (Chang)

Neural  basis  of  other-regarding  preference 

The  goal  of  this  project  is  to  understand  the  role  of  anterior  cingulate  cortex 
and  orbitofrontal  cortex  during  reward  allocation  decisions 

3.  NIH/NIMH  9/01/12-8/31/17 

NIH K99/R00 Pathway to Independence (Chang)

Role  of  oxytocin  in  the  amygdala-prefrontal  network  during  social  decision-making 
The  goal  of  this  project  is  to  undergo  extensive  training  in  neuroendocrinology,  and 
study  the  mechanisms  underlying  oxytocin-mediated  neural  processing  across 
amygdala  and  prefrontal  neurons  in  social  decision-making. 

4.  FRQS  9/01/12-9/01/15 

25559  Post-doctoral  award  (Gariepy) 

Neural  basis  of  social  behaviors  in  rhesus  macaques. 

The  goal  of  this  project  is  to  identify  the  regions  of  the  prefrontal  cortex  necessary  for 
learning  of  food  quality  by  direct  experience  and  by  social  observation. 


D.  Mentoring 

1.  A  successful  rotation  project  for  a  Duke  Cognitive  Neuroscience  PhD  candidate, 
Amy A. Winecoff, resulting in a second authorship in Chang et al., 2011.

2.  A  successful  rotation  project  for  a  Duke  Cognitive  Neuroscience  PhD  candidate, 
Joseph  W.  Barter,  resulting  in  a  second  authorship  in  Chang  et  al.,  2012. 

3.  A  successful  launch  of  many  related  projects  by  Dr.  Jean-Francois  Gariepy 

4.  A  successful  rotation  project  for  a  Duke  Cognitive  Neuroscience  PhD  candidate, 
Amanda  V.  Utevsky. 

5.  A  successful  transition  to  a  faculty  position  for  Dr.  Steve  Chang  (Yale  Univ.) 

6.  Successful  mentoring  of  Emily  Du  and  Diana  L.  Xie  (Duke  University)  and  Joshua 
Erb  (Columbia  University)  as  undergraduate  interns  who  have  worked  on  these 
projects. 
CONCLUSION


Other-regarding  preferences  (ORPs)  are  critical  for  normal  social  behavior,  and  the 
neural  mechanisms  underlying  ORPs  may  be  disrupted  in  neuropsychiatric  disorders 
marked  by  social  deficits,  including  autism  spectrum  disorders  (ASD).  Both  reward-related 
processing  in  the  brain  and  the  neuropeptide  oxytocin  (OT)  have  been  implicated  in  ASD. 
However,  the  neural  mechanisms  underlying  ORPs  remain  elusive,  partly  due  to  the  lack 
of  a  good  animal  model  for  studying  complex  social  behavior.  To  address  this  gap,  we 
developed  a  novel  social  interaction  task  involving  two  rhesus  macaques,  and  investigated 
the role of prefrontal cortical neurons, previously implicated in motivation and
decision-making, as well as the neuropeptide OT, which has previously been implicated in social
preferences,  during  the  expression  of  ORPs. 

We  found  that  rhesus  monkeys  care  about  what  happens  to  others,  as  indicated  by 
their  preference  to  deliver  juice  rewards  to  a  recipient  monkey  over  no  one  and  increased 
attention  to  the  recipient  monkey  following  prosocial  decisions.  Inhalation  of  OT  by 
monkeys  increases  both  the  frequency  of  prosocial  decisions  and  attention  to  the  recipient 
monkey.  Neuronal  recording  from  OFC  revealed  that  OFC  neurons  track  directly 
experienced  rewards  during  social  interactions.  By  contrast,  ACCg  neurons  signaled 
rewards  delivered  to  another  individual  as  well  as  shared  rewards.  Further  testing  revealed 
that ACCg is specialized for social learning, whereas the insula instead serves
learning from direct experience. Our studies have thus begun to reveal how the primate
brain  makes  decisions  during  social  interaction  with  other  individuals. 

OT  has  been  evaluated  for  potential  therapeutic  use  in  clinical  conditions  marked  by 
social  deficits,  such  as  ASD,  antisocial  personality  disorder,  and  schizophrenia.  Notably, 
the  nebulization  method  we  developed  demonstrated  that  inhaled  OT  actually  translocates 
to  the  central  nervous  system.  Moreover,  nebulization  is  well-tolerated  by  children  for 
delivery  of  other  therapeutics  (e.g.,  albuterol),  thus  opening  up  avenues  for  early  OT 
intervention  in  childhood. 

Our  findings  provide  new  opportunities  for  uncovering  the  neurophysiological  and 
neuroendocrinological  mechanisms  underlying  complex  social  behavior  in  a  species  much 
more  closely  related  to  humans  than  mice  or  rats.  Rhesus  monkeys  have  long  served  as 
the  preferred  model  species  for  probing  the  neural  mechanisms  underlying  complex 
cognition.  Given  the  strong  similarities  in  social  behavior  and  cognition,  together  with 
remarkable  homologies  in  neural  circuitry,  the  rhesus  macaque  provides  a  powerful  model 
for  probing  the  neurobiological  mechanisms  of  social  interactions  in  people. 


Medical  Implications  (“So  What”  section) 

Our  work  holds  promise  both  for  understanding  the  basic  mechanisms  that  support 
complex  social  behavior  and  translating  that  knowledge  into  improved  treatment  for  social 
dysfunction  in  ASD.  In  particular,  our  work  tests  the  idea  that  empathy  derives  from  the 
activation  of  neural  circuits  that  process  primary  emotions  or  feelings,  such  as  reward  or 
punishment,  merely  by  observing  the  same  things  happen  to  other  people.  OT  therapy  for 
ASD and other neuropsychiatric disorders is currently being explored in clinical trials,
despite uncertainty regarding the exact mechanism of action in the brain or the long-term
consequences  of  use.  By  testing  this  drug  in  an  animal  model,  we  can  directly  confirm 
efficacy,  efficiency,  and  long-term  safety.  Clinicians  can  use  this  information  to  directly 
inform  therapeutic  interventions  in  ASD.  We  demonstrated,  for  the  first  time  in  any  species, 
that  inhaled  OT  is  taken  up  by  the  central  nervous  system — an  important  prerequisite  for 
exploring  further  clinical  opportunities. 

Our  work  promises  ancillary  benefits  as  well.  Our  findings  regarding  the  functional 
role  of  the  OFC,  ACCg,  and  now  the  insula  will  also  be  of  use  in  clinical  contexts.  The 
precise  way  that  prefrontal  cortical  circuits  contribute  to  social  behavior  remains  poorly 
understood.  Our  studies  have  begun  to  sketch  out  how  these  circuits  mediate  social 
behavior.  Our  findings  may  prove  invaluable  in  the  diagnosis  and  treatment  of  social 
behavioral  disorders  that  accompany  head  trauma,  with  potentially  important  implications 
for  veterans  of  US  armed  forces  returning  from  the  battlefield  suffering  from  traumatic  brain 
injuries  and  attendant  problems  in  adjusting  to  civilian  life. 

Ultimately,  the  results  of  our  studies  will  inform  therapeutic  interventions  for  social 
disorders,  on  both  the  pharmacological  and  behavioral  levels,  and  significantly  improve  the 
lives  of  people  living  with  ASD  and  other  individuals  struggling  with  social  life. 
What happens to others profoundly influences our own behavior. Such other-regarding outcomes
can  drive  observational  learning,  as  well  as  motivate  cooperation,  charity,  empathy,  and  even 
spite.  Vicarious  reinforcement  may  serve  as  one  of  the  critical  mechanisms  mediating  the 
influence  of  other-regarding  outcomes  on  behavior  and  decision-making  in  groups.  Here  we 
show  that  rhesus  macaques  spontaneously  derive  vicarious  reinforcement  from  observing 
rewards  given  to  another  monkey,  and  that  this  reinforcement  can  motivate  them  to  subsequently 
deliver  or  withhold  rewards  from  the  other  animal.  We  exploited  Pavlovian  and  instrumental 
conditioning to associate rewards to self (M1) and/or rewards to another monkey (M2) with
visual cues. M1s made more errors in the instrumental trials when cues predicted reward to
M2 compared to when cues predicted reward to M1, but made even more errors when cues
predicted reward to no one. In subsequent preference tests between pairs of conditioned
cues, M1s preferred cues paired with reward to M2 over cues paired with reward to no one.
By contrast, M1s preferred cues paired with reward to self over cues paired with reward to
both monkeys simultaneously. Rates of attention to M2 strongly predicted the strength and
valence of vicarious reinforcement. These patterns of behavior, which were absent in non-social
control  trials,  are  consistent  with  vicarious  reinforcement  based  upon  sensitivity  to  observed, 
or  counterfactual,  outcomes  with  respect  to  another  individual.  Vicarious  reward  may  play 
a  critical  role  in  shaping  cooperation  and  competition,  as  well  as  motivating  observational 
learning  and  group  coordination  in  rhesus  macaques,  much  as  it  does  in  humans.  We  propose 
that  vicarious  reinforcement  signals  mediate  these  behaviors  via  homologous  neural  circuits 
involved  in  reinforcement  learning  and  decision-making. 


Keywords:  vicarious  reinforcement,  social  reward,  gaze,  social  interaction,  rhesus  macaques 


INTRODUCTION 

Reinforcement learning provides a powerful mechanism for associating
stimuli and actions with the direct experience of reward
and punishment (Rescorla and Wagner, 1972; Schultz et al., 1997;
Sutton and Barto, 1998). Behavioral and neurobiological evidence
indicates that human behavior also depends on outcomes that have
not been directly experienced. For example, fictive, or counterfactual,
learning describes sensitivity to reward outcomes for options
that were not chosen, were merely observed, or were even imagined
(Byrne, 2002; Lohrenz et al., 2007; Epstude and Roese, 2008).
Fictive learning can be described formally in terms analogous to
reinforcement learning (Lohrenz et al., 2007), and may depend on
overlapping neural circuitry (Lohrenz et al., 2007; Hayden et al.,
2009; Mobbs et al., 2009).

Observing what happens to others also powerfully shapes human
learning and behavior (Berger, 1962; Bandura and McDonald, 1963;
Bandura et al., 1963). Such other-regarding outcomes can drive
observational learning (Mobbs et al., 2009; Jeon et al., 2010), and
motivate other-regarding behaviors such as cooperation and charity,
as well as spite and schadenfreude (Takahashi et al., 2009). The
“warm glow” hypothesis (Andreoni, 1990) suggests that vicarious
reward and punishment motivate individuals to prefer either positive
or negative outcomes to others (Bandura et al., 1963; Fehr and
Fischbacher, 2003; Mobbs et al., 2009). Human social emotions
associated with vicarious reward and punishment, such as fairness
and envy, appear early in ontogeny, and their derangement in mental
disorders like psychopathy can have devastating consequences
(Kiehl, 2006).

Such observations endorse the idea that neural mechanisms
supporting vicarious reinforcement are derived specializations of
the human brain, which support complex social behavior including
observational learning, cooperation, and even altruism (Fehr and
Fischbacher, 2003). Though highly specialized for complex social
behavior in humans, these mechanisms appear to have deep
evolutionary roots. Behavioral and neurobiological evidence
demonstrates rudimentary forms of fictive, observational, and social
learning in non-human animals. Rhesus macaques, for example, learn
from fictive outcomes, and this process appears to be supported by
the same circuitry mediating fictive learning in humans (Hayden
et al., 2009). In some species, learning to perform a task is
facilitated by watching others learn the same task (Zentall and
Levine, 1972; Zentall et al., 1996; Drea and Wallen, 1999; Subiaul
et al., 2004; Whiten et al., 2009). Chimpanzees are capable of
learning to use complex tools by observing others (Tomasello et al.,
1987), and their observational learning seems to be contingent on
the associative strength of observed action and outcome (Crawford
and Spence, 1921). Observing another mouse receive a shock can
drive fear conditioning in the observer, and this observational fear
conditioning depends on affective pain circuitry that has been
implicated in empathy in humans (Jeon et al., 2010).

Whether mere observation of rewarding events occurring to another
individual can drive the expression of social preferences in
non-human animals, as proposed by the “warm glow” model, however,
remains debated. Some have argued that the expression of
other-regarding preferences in humans reflects the evolution of
mechanisms that promote cooperative reproduction, but the evidence
for other-regarding behaviors in cooperatively breeding animals
remains controversial (Burkart et al., 2007; de Waal et al., 2008;
Lakshminarayanan and Santos, 2008; Massen et al., 2010). Others
have argued that only those species most closely related to humans,
namely chimpanzees and bonobos, which possess the derived features
of human biology and cognition, in particular “theory of mind”
(Call and Tomasello, 2008), express other-regarding preferences,
but again the evidence for such behavior in apes remains
inconclusive (Tomasello et al., 2003; Silk et al., 2005).

We hypothesize instead that cooperation and competition endemic to
group life favor the evolution of neural circuits tuned to extract
information about the experiences of others, and that these circuits
serve as the core building blocks for the development of
observational learning and other-regarding behaviors, which reach
their fullest expression in our own species. As a first behavioral
test of this idea, we probed the impact of vicarious reinforcement
on subsequent decisions made by rhesus macaques with respect to
other monkeys. Rhesus monkeys observe others to gather social
information (Cheney and Seyfarth, 1990), display sensitivity to
fictive outcomes in non-social settings (Hayden et al., 2009), show
rudimentary understanding of the intentions of others (Flombaum and
Santos, 2005), care for kin (Maestripieri, 1994), and may give up
foods to alleviate pain in conspecifics (Masserman et al., 1964).
We hypothesized that such behaviors, as well as naturally occurring
behaviors such as social grooming, alliance formation, and group
territorial defense, derive from fundamental vicarious reinforcement
mechanisms similar to those guiding social behavior in humans.

To test this hypothesis, we capitalized on simple Pavlovian and
instrumental conditioning to associate liquid rewards to self and
rewards to another monkey with a set of visual cues, and
subsequently tested for preferences amongst these cues in a
two-alternative forced-choice task to infer underlying reward
associations. Subsequent preference tests between cues revealed a
preference to reward the other monkey rather than no one, but a
preference to withhold reward from the other when choosing between
rewarding self or both monkeys simultaneously. Crucially, monkeys
showed no preferences amongst the cues when the other monkey was
removed from the room and replaced with a juice collection bottle,
confirming the social dependence of vicarious reinforcement and thus
ruling out simple fictive learning as an explanation for the
observed behavior. Preferences amongst cues were predicted by the
relative subjective value of each cue, as inferred from the time it
took to initiate choosing each option, as well as the frequency with
which the actor monkey looked at the recipient monkey following
choices. These findings demonstrate that context-dependent vicarious
reinforcement guides decision-making with respect to others in
rhesus macaques.

MATERIALS  AND  METHODS 

GENERAL  PROCEDURES 

All procedures were approved by the Duke University Institutional
Animal Care and Use Committee and were designed and conducted
in compliance with the Public Health Service’s Guide for the Care
and Use of Animals. All rhesus macaques (Macaca mulatta) used in
the study were genetically unrelated, middle-ranked males (mean
age and SD, 9 ± 3.7), and none of the M1-M2 pairs were cagemates. All
monkeys involved in this study received at least 20 ml/kg of liquid
daily in addition to fluid earned in the experiment.

Horizontal and vertical eye positions were sampled (1000 Hz) by
an infrared eye-monitoring camera system (SR Research Eyelink).
Stimuli were controlled by a computer running Matlab using
PsychToolbox (Brainard, 1997; Pelli, 1997). All experiments were
carried out in a dimly lit room to ensure visibility of M1 and M2.
Both M1 and M2 were head-restrained during the experiments. M2
was always situated diagonally across from M1 at a 45° eccentricity
to the right from the center of M1’s screen, and each monkey faced
his own display screen; the two screens were located at a 90° angle
from one another (Figure 1A). The location of M2 (center of the
face) was mapped empirically prior to experiments using M1’s eye
positions. In the chair/juice control, an empty primate chair with
an operating juice tube and a depository bottle replaced M2. The
depository bottle was placed in the same space that would otherwise
be occupied by M2’s mouth region (all else in the control was
identical to the M1-M2 condition).

Solenoid valves that delivered the liquid rewards were placed in
another room to prevent monkeys from forming secondary associations
between solenoid clicks and different reward types. We also
included a separate solenoid designated for RNONE that only produced
clicks but delivered no fluid. Masking white noise was always
played in the experimental room. We used a relatively large juice
reward size (0.5-1 ml) per successful trial in order to clearly
demonstrate to M1 that M2 received juice rewards on RBOTH and ROTHER
trials. The reward size remained constant across different reward
conditions within each block. More specifically, the fluid-restricted
actor and recipient monkeys received, on average, 250 ml of liquid
in the form of cherry juice. The amount of fluid intake across
different experimental sessions only fluctuated within ~50 ml. During
the days without experimental sessions, the monkeys drank up to
500 ml ad lib. or more, which demonstrates their high motivational
level. Furthermore, they were very motivated by this reinforcement
schedule, given that they participated in the experiments
and continued to perform trials for about 2-3 h without stopping.

BEHAVIORAL  TASKS  AND  ANALYSIS 

The behaviors from two actor monkeys were examined. The tasks
were initially developed for neurophysiological investigations, and
therefore we limited the number of actor monkeys to two, which
is the standard and practical convention for neurophysiological
studies. This convention, however, weakens the generalizability of
the study. To address this as best we can, the current study also
reports the main findings and statistics separately for the two
monkeys. A total of eight M1-M2 and two M1-chair/juice pairs were
used in the study. Of these, three M1-M2 pairs and two chair/juice
controls (for each M1) were subjected to both Pavlovian and
instrumental conditioning trials with a novel stimulus set for each
pair. The remaining five M1-M2 pairs were tested based on already
learned cue-reward associations from the conditioning trials (i.e.,
from the three M1-M2 pairs). The complete set of visual cues used is
shown in Table 1. One actor monkey (MY) served as M1 first then was
also tested as M2 at the very end, whereas the other actor monkey
(MO) was tested as M2 at the very beginning, then served as M1
from then on (see box in Figure 3B for the complete list of
pairings). Context-dependent preferences were evident in both M1s
(see main text for statistics). Other monkeys involved in the study
only served as M2.

FIGURE 1 | Experimental setup and behavioral paradigms. (A) An actor
monkey (M1) performed behavioral tasks in the presence of a recipient monkey
(M2) in a dimly lit room. (B) Typical stimuli used for the monkey-monkey
(M1-M2) and monkey-chair/juice (M1-C/J) conditions. See Table 1 for all the
stimuli used. (C) Behavioral tasks. Top, Pavlovian conditioning task. Middle,
instrumental conditioning task. Bottom, preference task. Pavlovian and
instrumental conditioning trials were randomly interleaved. Preference trials
were run to test M1's vicariously conditioned preferences.

The conditioning task consisted of randomly interleaved
Pavlovian (Figure 1B, top) and instrumental conditioning trials
(Figure 1B, middle). On both trial types, M1 initiated the trial by
shifting gaze to a central stimulus (0.7° × 0.7°). After 200 ms of
fixation, a cue (5° × 5°) of different shape and/or color appeared
in the center and remained on for 1 s on Pavlovian trials and for
300 ms on instrumental trials. Visual cues on Pavlovian trials
contained a white outline around the same cues used to convey
the same reward outcomes on instrumental trials (Figure 1C).
On Pavlovian trials, cue onset marked the end of the fixation
requirement (i.e., free to look anywhere), and the appropriate
reward outcome was delivered. On instrumental trials, however,
extinction of the cue was followed by another 200 ms of central
fixation before a white target stimulus (1° diameter) appeared at
one of eight random locations (eccentricity of 8°). M1 had 1.5 s
to shift gaze to within 3.4° of the target. After successful target
acquisition, the appropriate reward was delivered. From the onset of
reward, M1 was free to look anywhere in the setup for 1 s before the
next trial began. Rewards were delivered at approximately the same
time on Pavlovian and instrumental trials: on a trial-by-trial
basis, the reward timing of each Pavlovian trial was matched to that
of the preceding instrumental trials (which required motor
responses). Data from 120 ± 57 (median ± SD) and 173 ± 59 correct
trials were collected for each pair and the non-social control,
respectively.

Table 1 | Stimulus-reward pairs used in the experiments.
Columns: stimulus-reward associations (RSELF, RBOTH, ROTHER, RNONE).
Rows: conditioned pairs on the conditioning trials (M1-M2) and preference
tested pairs on the preference trials (M1-M2), e.g., MY-MD, MY-C/J, MO-C/J.
[Stimulus images not reproduced.]
Stimuli used for all individual monkey-monkey (e.g., MY-MD) pairs and monkey-chair/juice (e.g., MY-C/J) controls. On Pavlovian trials, a white outline was present
on these cues (e.g., see Figure 1C).

In the preference task (Figure 1C, bottom), M1 again began
each trial by shifting gaze to the fixation stimulus. After 200 ms
of central fixation, two of the previously learned cues from the
conditioning task appeared as targets at two of eight random
locations 8° from the central fixation stimulus, separated by 180°
(e.g., Figures 1A,B, bottom). Upon target onset, M1 shifted gaze
to one or the other target, and the reward outcome associated
with that chosen target was delivered. M1 had 1.5 s to shift
gaze to the target (±3.4°). Data from 229 ± 88 and 122 ± 78
correct trials were collected for each pair and the non-social
control, respectively.
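
The target geometry of the preference task can be illustrated with a short sketch. This is not the authors' Matlab/PsychToolbox code; it assumes that the eight candidate locations are evenly spaced every 45° around fixation (the text states only that there were eight random locations at 8° eccentricity), and the function names are hypothetical.

```python
import math
import random

ECCENTRICITY_DEG = 8.0   # target distance from fixation (from the text)
N_LOCATIONS = 8          # number of candidate locations (from the text)

def candidate_locations(n=N_LOCATIONS, ecc=ECCENTRICITY_DEG):
    """Return (x, y) positions in degrees, ASSUMING even angular spacing."""
    return [(ecc * math.cos(2 * math.pi * k / n),
             ecc * math.sin(2 * math.pi * k / n)) for k in range(n)]

def preference_targets(rng=random):
    """Pick one location at random plus the location 180 degrees opposite."""
    locs = candidate_locations()
    k = rng.randrange(len(locs))
    opposite = (k + len(locs) // 2) % len(locs)
    return locs[k], locs[opposite]
```

Under this assumption the two cues always lie on opposite sides of the fixation point, so neither option is systematically closer to M2's position.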

For both tasks, when an error occurred (i.e., failure to maintain
fixation after cue onset or inaccurate gaze shift to the peripheral
target), the trial was aborted, and a white error square
(14.2° × 14.2°) appeared on the screen for 1.5 s. On Pavlovian
conditioning trials, errors were defined as failures to maintain
fixation after acquiring the fixation point to start a trial.
Because these errors were independent of any reward contingencies
(i.e., before cue onset), we did not consider them here. On
instrumental conditioning and choice trials, errors were defined as
either failures to maintain fixation in the beginning of a trial or
breaking fixation or not acquiring a target after the reward
contingencies were revealed (after cue onset). In practice, almost
all errors resulted from monkeys looking up and away from the
computer monitor. Error trials were excluded from further analyses.

We calculated a vicarious reinforcement index (VRI) by computing
the difference between the frequency of choosing one option
(n_A) and the other (n_B) and then normalizing the difference by
the sum:

$$\mathrm{VRI} = \frac{n_A - n_B}{n_A + n_B} \qquad (1)$$

In the Self/Both context, n_A and n_B were the number of RBOTH and
RSELF choices, respectively, whereas in the Other/None context,
they were the number of ROTHER and RNONE choices, respectively. The
VRI always ranged from -1 to 1, with 1 corresponding to M1 always
choosing the prosocial option (either RBOTH or ROTHER), -1
corresponding to M1 always choosing the non-prosocial option (either
RSELF or RNONE), and 0 corresponding to M1 choosing each of the
alternatives equally often.
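
As a minimal sketch of Eq. 1, the index can be computed directly from the two choice counts. The function name and the example counts are hypothetical, and the handling of sessions with no choices is an assumption that the text does not address.

```python
def vicarious_reinforcement_index(n_a, n_b):
    """Return VRI = (n_A - n_B) / (n_A + n_B), which lies in [-1, 1].

    n_a: number of prosocial choices (RBOTH or ROTHER, by context)
    n_b: number of non-prosocial choices (RSELF or RNONE, by context)
    """
    total = n_a + n_b
    if total == 0:
        # Assumption: the index is treated as undefined without choices.
        raise ValueError("VRI is undefined when no choices were made")
    return (n_a - n_b) / total

# Hypothetical session: 33 ROTHER choices and 7 RNONE choices.
example_vri = vicarious_reinforcement_index(33, 7)
```

For these hypothetical counts the index is 26/40 = 0.65, a preference for rewarding the other monkey.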

Saccade reaction times (RTs; time from target onset to movement
onset) were computed using a 20°/s velocity crossing threshold
on each trial. The frequency of M1 looking at M2 was computed
by counting the number of gaze shifts made by M1 into a 25° × 25°
window spanning from the center of M2’s face during the peri-reward
free-viewing period (from the start of reward delivery up to 1 s
after the completion of reward delivery; Figure 1B). On non-social
control trials, this region was occupied by an operating juice tube
and a depository bottle situated in the neckplate of our primate
chairs.
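
The reaction-time criterion can be sketched as a simple threshold search over sample-to-sample eye velocity. This is an illustration under stated assumptions and not the authors' analysis code: it assumes unfiltered position traces in degrees sampled at 1000 Hz, defines velocity as the two-dimensional displacement between consecutive samples, and uses hypothetical function and parameter names. Real eye traces would normally be smoothed first.

```python
import math

def saccade_reaction_time(x_deg, y_deg, target_onset_idx,
                          sample_rate_hz=1000.0, threshold_deg_per_s=20.0):
    """Return the RT in ms from target onset to the first sample whose
    speed exceeds the threshold, or None if it is never exceeded."""
    dt = 1.0 / sample_rate_hz
    for i in range(target_onset_idx + 1, len(x_deg)):
        # Speed between consecutive samples, in degrees per second.
        speed = math.hypot(x_deg[i] - x_deg[i - 1],
                           y_deg[i] - y_deg[i - 1]) / dt
        if speed > threshold_deg_per_s:
            return (i - target_onset_idx) * 1000.0 / sample_rate_hz
    return None
```

With a trace that stays at fixation for 200 samples and then moves, and a target onset at sample 50, the function returns 150 ms.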

RESULTS 

MONKEYS  EXHIBIT  VICARIOUS  REINFORCEMENT 

Two adult male rhesus monkeys served as actors (M1) and five adult
male rhesus monkeys served as recipients (M2; see Materials and
Methods). M1 and M2 sat across from each other (Figure 1A), and
each viewed his own computer screen, which displayed visual cues.
On Pavlovian trials (Figure 1B, top), M1 and M2 both saw the same
cue at the center of the display, and juice rewards were delivered
to M1 (RSELF), M2 (ROTHER), both M1 and M2 (RBOTH), or neither
(RNONE) depending on the color or shape of the cue (Figure 1C;
Table 1). On instrumental trials (Figure 1B, middle), M1 and M2
again both saw the same cue and a neutral target appeared, to which
M1 had to shift gaze for subsequent delivery of juice reward to M1,
M2, both M1 and M2, or neither.

Error rates (failure to maintain fixation after cue onset or
inaccurate gaze shift to the peripheral target; see Materials and
Methods) on the instrumental trials demonstrated that M1
discriminated among the four reward conditions (Figure 2A).
For both instrumental conditions in which M1 received direct
fluid reward, error rates were indistinguishable whether or not
M2 was also rewarded (RSELF and RBOTH; p = 0.54, Wilcoxon signed
rank test; n = 57 sessions; Figure 2A). In contrast, M1 made
significantly more errors on trials with cues that did not result
in direct fluid reward to self compared with trials displaying
cues that predicted direct fluid reward to self (RSELF or RBOTH
versus ROTHER or RNONE; all p < 0.00001, Wilcoxon signed rank test;
n = 57 sessions; Figure 2A).

Notably, M1 continued to perform instrumental trials with
cues predicting reward to M2 (ROTHER) or no one (RNONE) despite
the fact that he was never rewarded in either case and error
rates clearly showed that M1 did not prefer these cues [error
rates: 71.0 ± 4.3% (mean ± SEM per session) and 84.8 ± 2.8%,
respectively]. Critically, M1 made significantly fewer errors when
the cue predicted a fluid reward for M2 compared with when
the cue predicted no one would receive a fluid reward, indicating
a reinforcing property to observing M2 receive a reward
(p < 0.00001, Wilcoxon signed rank test; n = 57; Figure 2A). This
pattern of systematically lower error rates on ROTHER compared to
RNONE was also evident in each M1 individually (both p < 0.005,
Wilcoxon signed rank test; n = 44 sessions for MY and 13 for MO).
In contrast, in a non-social control in which M2 was replaced
with a collecting bottle (chair/juice control; see Materials and
Methods), the error rates for responding to cues predicting
reward to other (ROTHER) and reward to no one (RNONE) were
statistically indistinguishable (p = 0.20, Wilcoxon signed rank test;
n = 9 sessions; Figure 2B).
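
The paired comparisons above use the Wilcoxon signed rank test. The text does not say which implementation was used, so the following is only a self-contained sketch of the test logic using the large-sample normal approximation, without continuity or tie-variance corrections; an exact test is preferable for small samples. The function name and example data are hypothetical.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed rank test (normal approximation).

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return w_plus, w_minus, p_value
```

In the study's setting, `x` and `y` would hold the per-session error rates for the two conditions being compared, for example RNONE and ROTHER.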

The presence of another monkey clearly influenced error
rates during conditioning. M1 made fewer errors overall in the
non-social control compared to the social trials (total error rates:
15.5 ± 3.8% versus 47.6 ± 2.5%, p < 0.00001, Wilcoxon rank sum test;
n = 57; Figure 2A). The higher error rates on the social compared
to the non-social control trials could be attributed to increased
attentional demands due to the presence of another monkey (e.g.,
bystander effect). Error rates during the conditioning trials
demonstrate that rhesus monkeys value rewards to self more than they
value rewards to others, as expected. Nonetheless, the fact that M1
continued to participate when only M2 was rewarded directly with
juice suggests that observing another monkey receive a reward is
vicariously reinforcing.

FIGURE 2 | Error patterns during instrumental conditioning demonstrate
rhesus macaques are sensitive to others' rewards. (A) Error rates
(excluding first sessions) on the instrumental conditioning trials (mean of
sessions ± SEM) in M1-M2 conditions (n = 57 sessions). (B) Error rates in the
non-social (M1-C/J) controls (n = 9 sessions). Same format as in (A).

CONTEXT-DEPENDENT MANIFESTATION OF VICARIOUS REINFORCEMENT

Subsequently, we used a two-alternative forced-choice task
(preference task; Figure 1B, bottom) to directly test the hypothesis
that observing another monkey receiving a reward is vicariously
reinforcing. In the preference task, M1 chose between pairs of
previously conditioned cues (RSELF versus RBOTH, or ROTHER versus
RNONE) by shifting gaze to one of them. Critically, rewards were
matched between the available choices in each condition; that is, M1
chose between ROTHER and RNONE [Other/None condition (purely
vicarious context); M1 never directly rewarded with juice] or
between RBOTH and RSELF (Self/Both condition; M1 always rewarded
with juice). We hypothesized that cues would acquire value
vicariously via Pavlovian and instrumental conditioning, and that
differential cue values would be expressed as systematic preferences
in this choice task.

As expected, error rates in the preference task were consistent
with a preference for receiving direct fluid reward in the Self/Both
condition (error rate: 0.8 ± 0.2%; n = 64 sessions), compared
to no fluid reward in the Other/None condition, in which M1
was never rewarded (12.6 ± 1.8%; p < 0.00001, Wilcoxon signed
rank test; n = 64). Remarkably, however, M1 performed about
87% of trials in which he was not directly rewarded with fluid.
Again, as in the conditioning trials, M1 made significantly fewer
errors during the non-social control (n = 13 sessions) compared
to when M2 was present (p < 0.001, Wilcoxon rank sum test). M1
was significantly more willing to complete trials which resulted
in no reward to M1 during the preference trials compared to the
instrumental conditioning trials (correct rate: 87.4 ± 1.8% versus
22.1 ± 2.7%, p < 0.00001, Wilcoxon rank sum test). This is
consistent with prior observations in rhesus macaques that voluntary
choices are more motivating than simple operant responses in the
conditioning tasks (Suzuki, 1999).

The critical question was whether M1 acquired an intrinsically
rewarding preference, through vicarious reinforcement,
for rewarding M2 in the absence of rewarding self (ROTHER). The
choice preferences of M1 demonstrated that cues indeed acquired
strong motivational associations even when M1 received no direct
reward. M1 consistently preferred ROTHER (82.5 ± 1.1%) over RNONE
(17.5%; p < 0.00001, Wilcoxon signed rank test; n = 64 sessions),
even though M1 was never directly rewarded with juice in this
context (Figure 3A). Critically, this preference was absent in the
non-social control when M2 was removed from the experimental
room and replaced by an operating juice tube and a collection
bottle [Figure 3A; 54.7 ± 3.8 (ROTHER) versus 45.3% (RNONE), p = 0.17,
Wilcoxon signed rank test; n = 13]. In contrast, in the Self/Both
context, M1 consistently preferred RSELF (80.3 ± 1.0%) over RBOTH
(19.7%; p < 0.00001, Wilcoxon signed rank test; n = 64), even though
either choice led to the same physical juice reward for M1
simultaneously (Figure 3A). This pattern was again absent in the
non-social control [Figure 3A; 48.3 ± 1.3 (RSELF) versus 51.7% (RBOTH),
p = 0.06, Wilcoxon signed rank test; n = 13]. We observed the
context-dependent patterns of behavior in each M1 separately [percentage
of choosing ROTHER (MY and MO): 86.7 ± 1.5 and 80.7 ± 1.4%;
percentage of choosing RSELF: 85.4 ± 1.6 and 79.6 ± 1.1%].

We  further  quantified  Ml’s  preferences  by  calculating  a  VRI,  a 
contrast  ratio  varying  from  -1  to  1,  with  positive  values  indicat¬ 
ing  preferences  for  Rother  over  RNONE  (Other/None  condition) 
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FIGURE  3  |  Context-dependent  vicarious  reinforcement  drives  the 
expression  of  other-regarding  preferences  in  rhesus  macaques.  (A)  Choice 
preferences  (median  of  all  sessions  ±  SEM)  in  the  Other/None  and  Self/Both 
contexts  across  Ml -M2  pairs  (8  pairs,  64  session)  and  Ml -chair/juice  controls 
(2  pairs,  13  sessions).  (B)  Choice  preferences  expressed  as  VR  indices 
(median  of  all  sessions  ±  SEM)  in  the  Other/None  and  Self/Both  contexts 
across  M1-M2  pairs  (see  box  for  individual  pair  medians  and  standard 
deviations  (SDs)  for  their  ranges)  and  Ml-chair/juice  controls.  Bars  are 
color-coded  by  the  partner  type  in  both  panels  (see  box  in  A). 


SOCIAL  VARIABLES  INFLUENCE  VICARIOUS  REINFORCEMENT 

The  magnitudes  of  the  VRI  were  idiosyncratic  to  individual 
pairs  of  monkeys.  Such  differences  were  apparent  from  the  very 
beginning  of  testing  and  remained  more  or  less  stable  (Figure  4). 
We  tested  whether  a  specific  social  variable  could  explain  this 
individual  variability.  First,  we  examined  social  status,  which  is 
known  to  influence  social  behaviors  in  both  young  children  and 
non-human  animals  (Hawley,  1999),  and  observational  learning 
has  been  implicated  in  how  monkeys  acquire  social  hierarchical 
information  (Cheney  and  Seyfarth,  1990).  We  found  that  Ml  was 
more  willing  to  share  reward  if  Ml  was  dominant  to  M2  in  the 
Self/Both  context  (n  =  4  out  of  8).  Specifically,  Ml  was  more  likely 
to  choose  Rboth  in  the  Self/Both  context  [VRI:  -0.54  ±  0.03  (Ml 
is  dominant)  versus  -0.65  ±  0.03  (Ml  is  subordinate),  p  <  0.01, 
Wilcoxon  rank  sum  test] ,  but  not  necessarily  Rother  in  the  Other/ 
None  context  (0.62  ±  0.02  versus  0.58  ±  0.04,  p  —  0.57,  Wilcoxon 
rank  sum  test),  if  Ml  is  dominant  to  M2. 

Second,  we  examined  whether  the  familiarity  of  individuals  in 
each  pair  biased  choices  by  analyzing  the  housing  locations  of  Ml 
relative  to  M2  in  the  colony  room,  which  served  as  our  measure  of 
familiarity.  It  has  been  documented  that  social  interaction  behav¬ 
iors  increase  with  familiarity  in  both  humans  and  monkeys  (Preston 
and  de  Waal,  2002).  We  therefore  reasoned  that  monkeys  who  could 
directly  view  each  other  (housed  on  opposite  sides,  compared  to  on 
same  sides)  would  be  more  familiar  and  thus  more  likely  to  reward 
others.  We  found  that  VRI  in  the  Other/None  context  was  higher  if 
Ml  and  M2  were  housed  on  opposite  sides  (yi  -  4  out  of  7)  of  the 
colony  room,  with  direct  visual  access  to  each  other.  That  is,  M 1  was 
more  likely  to  choose  Rother  in  the  Other/None  context  [0.71  ±  0.02 
(opposite  side)  versus  0.53  ± 0.03  (same  side),p<  0.0001,  Wilcoxon 
rank  sum  test] ,  but  not  necessarily  RB0TH  in  the  Self/Both  context 
(-0.60  +  0.03  versus  -0.57±0.02,p  =  0.19,  Wilcoxon  rank  sum  test), 
if  he  could  see  him  while  in  his  home  cage.  Together,  these  find¬ 
ings  suggest  that  individual  variability  in  vicarious  reinforcement 
(Figures  3B  and  4)  is  at  least  partially  influenced  by  both  social 
dominance  and  social  familiarity,  although  our  limited  sample  size 
and  types  preclude  strong  conclusions. 


or Rboth over Rself (Self/Both condition) and 0 indicating indifference
(see Materials and Methods). Analysis of the index led
to similar results. In the Other/None condition, M1 preferred
to reward M2 (VRI: 0.60 ± 0.02, significantly different from 0,
p < 0.00001, Wilcoxon sign rank test; n = 64 sessions; Figure 3B),
and this pattern was absent in the non-social control (0.11 ± 0.08,
p = 0.18, Wilcoxon sign rank test; n = 13; Figure 3B). In the
Self/Both context, however, M1 preferred to withhold reward
from M2 (-0.58 ± 0.02, p < 0.00001, Wilcoxon sign rank test;
Figure 3B), and this pattern was only weakly evident in the non-social
control (0.06 ± 0.03, p = 0.06, Wilcoxon sign rank test;
Figure 3B). Again, we observed the same pattern in each M1
separately [Other/None context (MY and MO): 0.70 ± 0.03 and
0.56 ± 0.03; Self/Both context: -0.65 ± 0.03 and -0.55 ± 0.02; all
p < 0.00001, Wilcoxon sign rank test; n = 19 and 45, respectively].
These preferences remained stable over the course of data collection
(Figure 4). Crucially, the VRI indices in the Other/None
and Self/Both contexts never crossed over.
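
As a rough illustration of how an index of this kind, and its time course over a session, could be computed, the sketch below assumes that the VR index is a normalized contrast of choice counts, (a - b)/(a + b), which equals +1 when the first option is always chosen, -1 when it is never chosen, and 0 at indifference. The exact definition is given in Materials and Methods and may differ. The function names and the toy choice sequence are hypothetical; the 200-trial window and 100-trial step follow the Figure 4 caption.

```python
import math


def contrast_index(choices, option_a, option_b):
    """Assumed index: (a - b) / (a + b) over choice counts; NaN if neither option occurs."""
    a = choices.count(option_a)
    b = choices.count(option_b)
    if a + b == 0:
        return math.nan
    return (a - b) / (a + b)


def sliding_index(choices, option_a, option_b, width=200, step=100):
    """Index in moving bins of `width` trials, advanced by `step` trials."""
    starts = range(0, len(choices) - width + 1, step)
    return [contrast_index(choices[s:s + width], option_a, option_b) for s in starts]


# Hypothetical Other/None session of 400 choice trials (illustrative, not real data).
session = ["other"] * 150 + ["none"] * 50 + ["other"] * 100 + ["none"] * 100
overall = contrast_index(session, "other", "none")
bins = sliding_index(session, "other", "none")
print(overall, bins)
```

Plotting the binned values for each context against bin number would give traces comparable in form to those described for Figure 4.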


MONKEYS  OBSERVE  THE  REWARDING  EVENTS  OF  OTHERS 

After monkeys expressed their choice, they were permitted to freely
look about (Figure 1B). During this free-viewing period, M1 often
shifted gaze toward the face of M2, and the overall rate of shifting
gaze depended on the reward outcome for M1 [Figure 5A;
20.5 ± 3.6% (median ± SEM of the average between Rother and
Rnone) versus 3.0 ± 2.5% (Rself and Rboth), p < 0.00001, Wilcoxon
sign rank test; n = 64 sessions]. Critically, however, M1 looked at
M2 more frequently after choosing to reward him over no one in
the Other/None condition (Rother, 25.4 ± 4.1%; Rnone, 16.7 ± 5.8%,
p < 0.0005, Wilcoxon sign rank test). We found a significant effect
(frequency of gaze after choosing Rother > Rnone) in both M1s
separately [MY: 23.1 ± 1.6 versus 15.4 ± 2.3% (p < 0.005; n = 19);
MO: 26.1 ± 5.6 versus 17.9 ± 8.1% (p < 0.01; n = 45), Wilcoxon
sign rank test]. Thus, our observation confirms that there is a link
between social attention and vicarious reinforcement.
between  social  attention  and  vicarious  reinforcement. 

By contrast, in the non-social control (n = 13 sessions), looking
behavior was greatly reduced across all reward outcomes, compared
to the social conditions (Rself, Rother: p < 0.01; Rboth: p = 0.12; Rnone:
p = 0.01, Wilcoxon rank sum test; Figure 5A). Critically, M1 neither
looked at the juice bottle more often after choosing Rother over
Rnone in the Other/None condition (p = 0.34, Wilcoxon sign rank
test), nor after choosing Rboth over Rself in the Self/Both condition
(p = 0.94, Wilcoxon sign rank test). The only factor that explained
looking behavior in the non-social control was whether or not M1
was directly rewarded with juice (Rself and Rboth versus Rother and
Rnone, p < 0.0005, Wilcoxon rank sum test). Thus, reward consumption
by another monkey strongly recruits attention in the absence
of direct reward to self, suggesting vicarious reinforcement may be
mediated by social attention circuits in the brain (Klein et al., 2009).

FIGURE 4 | Temporal progression (moving 200-trial bins with a step size of 100 trials) of context-dependent VR indices for individual M1-M2 pairs (8 pairs,
64 sessions) and M1-C/J pairs (2 pairs, 13 sessions; see box for pair identities). Data points on the right show individual pair medians and SDs across all trial
bins for each pair.

SACCADE  REACTION  TIMES  REVEAL  THE  INTERNAL  DELIBERATIVE 
PROCESS 

The pattern of saccade RTs on choice trials further corroborates
the hypothesis that rewarding self was more valued than any other
alternative (Figure 5B; RTs for Rself < Rboth < Rother < Rnone; all
comparisons p < 0.00001, Wilcoxon sign rank; n = 64 sessions).
Generally, M1 responded more quickly whenever he chose to
directly reward himself with juice. Nonetheless, M1 responded
faster when he chose to reward M2 than when he chose to reward
no one at all. These results were obtained for each M1 separately (all
comparisons p < 0.01 for each M1, Wilcoxon sign rank test; n = 19
and 45 sessions for MY and MO, respectively). Importantly, in the
absence of M2 (non-social control; n = 13 sessions), RTs across different
reward outcomes remained more or less flat (Figure 5B). RTs
were indeed slower overall in the presence of M2, perhaps due to
an additional attentional load induced by the presence of M2 (blue
versus red traces in Figure 5B; all comparisons p < 0.005, except
p = 0.46 for Rself conditions, Wilcoxon rank sum test).
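
The paired comparisons reported here rely on the Wilcoxon sign rank test. For readers unfamiliar with it, the following is a minimal pure-Python sketch of the two-sided test using the large-sample normal approximation, without tie or continuity corrections. It is not the authors' analysis code; the function name and the example differences are hypothetical, and an established statistics package would normally be preferred in practice.

```python
import math


def wilcoxon_signed_rank(diffs):
    """Two-sided Wilcoxon signed-rank test (normal approximation, no tie correction).

    `diffs` holds paired differences, e.g. per-session RT(none) - RT(other).
    Returns (W+, z, p), where W+ is the sum of ranks of the positive differences.
    """
    d = [x for x in diffs if x != 0]  # zero differences carry no sign and are dropped
    n = len(d)
    if n == 0:
        raise ValueError("no non-zero differences")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w_plus, z, p


# Hypothetical per-session RT differences in ms (illustrative, not real data).
rt_diff_ms = [12, 25, -4, 31, 18, 9, -7, 22, 15, 27]
print(wilcoxon_signed_rank(rt_diff_ms))
```

The normal approximation is only reasonable for moderately large samples; for small n, exact tables or a library implementation would be more appropriate.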

Given that monkeys generally respond more slowly when they
anticipate smaller rewards (Kawagoe et al., 1998; Roesch and Olson,
2004), we inferred the subjective reward value of the four conditions
to be Rself > Rboth and Rother > Rnone. These inferred subjective
reward values, which were absent in the non-social control
(Figure 5B), predict the relative preferences between cues observed
in the preference task (Figure 3). Specifically, M1 chose Rself over
Rboth in the Self/Both condition and showed faster RT for choosing
Rself, whereas M1 chose Rother over Rnone in the Other/None
condition and showed faster RT for choosing Rother.

DISCUSSION 

We demonstrated that social preferences of rhesus macaques - non-human
primates that live in large, hierarchical, mixed-sex social
groups and who last shared a common ancestor with humans some
25 million years ago - could be shaped by vicarious reinforcement
in a context-specific manner. Monkeys systematically preferred to
provide juice reward to others rather than to no one, as if observing
others drink is vicariously rewarding. In contrast, monkeys systematically
withheld reward from others when confronted with the
options to either consume reward alone or share reward. Increased
social attention to M2 (i.e., the increased rate of gaze shift to M2)
in the Other/None context corroborates enhanced vicarious reinforcement
during social decision-making.

Rewarding the other monkey without any opportunity to reward
self is a uniquely vicarious form of reward. Such vicarious reinforcement
may be driven by an intrinsic tendency to observe the
experience of others to gather information, as can occur in foraging
(Cheney and Seyfarth, 1990; Valone and Templeton, 2002). It is possible,
however, that monkeys simply find feedback to their actions
intrinsically rewarding. For instance, choosing to reward others in
the Other/None context is the only option that results in a salient
feedback that could serve as a secondary reinforcer or confirmation
that a chosen action has resulted in a noticeable change in the environment.
However, the preference to reward only self in the Self/Both
context makes this possibility less likely (although the actor
monkeys may have been less interested in the other monkeys due
to receiving reward or the competitiveness evoked by this context),
since choosing to reward both would also result in salient feedback.
Furthermore, the absence of preference in the non-social control
trials indicates that mere actions that result in fluid delivery are
not sufficient to drive vicarious reinforcement, suggesting that the
presence of a social agent is required. Notably, however, monkeys
still showed high error rates (71%) in the conditioning trials when
the visual cues predicted reward to the other monkey only. Interestingly,
the error rates were much lower (<13%) when monkeys confronted
a choice between Other/None in the preference task. This is consistent
with observations that rhesus macaques are much more
motivated when making voluntary choices compared to making
simple operant responses (Suzuki, 1999). Still, the atypically large
error rates observed in the conditioning trials seem to be consistent
with the competitiveness of rhesus macaques, and may highlight
differences between humans and rhesus macaques (also see below).

FIGURE 5 | Gaze behavior reflects the internal social deliberation
process. (A) The frequency of gaze shifts (%; median ± SEM of individual
sessions; 64 M1-M2 sessions, and 13 M1-C/J sessions) toward the face
region of M2 (or toward the juice tube and the bottle on control trials) during
the free-viewing period (after choosing a reward option) of the preference
task. (B) Saccade reaction times (RTs; median ± SEM of individual sessions;
64 M1-M2 pairs, and 13 M1-C/J pairs) for choosing different reward
outcomes in the choice task. Asterisks indicate significance in (A,B):
*p < 0.05; **p < 0.005 by Wilcoxon sign rank test across same partner types,
and Wilcoxon rank sum test across different partner types. Dashed vertical
lines distinguish Self/Both and Other/None contexts.

In contrast, either of the two available options in the Self/Both
context results in direct fluid reward. The preference to withhold
reward from others in this particular context may reflect a potential
diminishment of reward during simultaneous consumption,
possibly due to the uncertainty of the quantity or quality of
reward delivered to others. Reward withholding behavior may
also arise from rhesus monkeys’ natural competitive tendencies
(Anderson and Mason, 1978). For instance, from an ecological
standpoint, sharing food with other individuals always reduces
the amount of potential food available to oneself. Moreover,
reduced rates of attending to M2 in the Self/Both context may
further mitigate vicarious reinforcement during social decision-making.
We observed a small but significant tendency to withhold
less if actor monkeys were dominant to recipient monkeys,
although our limited sample size and types preclude strong conclusions.
This is consistent with a recent study in long-tailed
macaques (M. fascicularis) showing that dominant macaques
are more “prosocial” toward subordinates (Massen et al., 2010).
Dominant monkeys might be more likely to engage in such positive
other-regarding behaviors to sustain their rank and promote
group cohesion, especially when there is no added cost, as in
the Self/Both context. By extension, we would predict humans
to choose to reward both individuals in an analogous monetary
version of the Self/Both context, as long as the monetary reward
was the same for both individuals and the amount of reward
was undiminished by sharing (i.e., non-competitive situation).
If the monkeys were clearly aware that they both always received
the same amount of juice with an infinite amount of resources,
they might also increase preferences to reward both monkeys.
Alternatively, it is also plausible that the rhesus macaques, unlike
humans, have a difficult time ignoring their naturally competitive
cognitive set.

It is critical to emphasize the dramatic differences in preferences
between Self/Both and Other/None contexts. If the
actor monkeys always found it valuable to reward the recipient
monkey, then we would have expected the monkeys to prefer
to reward both in the Self/Both context. Alternatively, if the
monkeys always found rewards delivered to the other monkey
to be aversive, perhaps due to perceived competition, then we
would have expected the monkeys to prefer to reward none in
the Other/None context. Instead, we observed a clean dissociation
of preferences depending on social context, suggesting
that different reward contingencies strongly influenced decisions.
This is consistent with our findings that RTs, frequency
of attention directed to the other monkey, and error rates were
clearly different between choosing to reward both and choosing
to only reward other. (Please also see our response above for
situation-specific social behaviors in humans and monkeys.) The
behavioral and neural mechanisms responsible for such context-dependent
social decision-making would provide new insights
into the social flexibility characterizing the behavior of macaques
and other primates, including humans.

We hypothesize that vicarious experiences are processed as
rewarding signals in the brain, and are mediated by neurons in
homologous circuits governing social perception and reward learning
in non-human primates and humans (Bandura and Rosenthal,
1966; Fehr and Camerer, 2007; Lohrenz et al., 2007; Lee, 2008;
Hayden et al., 2009; Mobbs et al., 2009). One plausible mechanism
is that overlapping populations of neurons respond both to
rewards to self and rewards to another individual. Such vicarious
reward could motivate social interactions as well as underlie
observational learning and mutualistic behaviors such as alliance
formation, social grooming, and group cohesion (Fehr and
Fischbacher, 2003; Takahashi et al., 2009). Modulation of vicarious
reward signals by social variables such as dominance or familiarity
could further provide a mechanism promoting socially adaptive
behavior toward specific individuals.

Observing rewarding events of others has been shown to
systematically and effectively modulate neural activity in classic
reward areas in humans, including ventral striatum, ventromedial
prefrontal cortex, and anterior cingulate cortex (Mobbs et al.,
2009; Lombardo et al., 2010). Moreover, the anterior cingulate
cortex has been implicated in evaluating social information with
respect to others (Takahashi et al., 2009). Dorsolateral and ventromedial
prefrontal cortices in humans have been implicated
in observing an action and observing reward outcome of others,
respectively (Burke et al., 2010). Observational fear conditioning
in mice depends on affective pain circuitry including
anterior cingulate cortex (Jeon et al., 2010). Activation of these
neural circuits by vicarious outcomes may be the neural substrate
that ultimately promotes empathy and altruism, as well as
observational learning.


These  findings  suggest  that  vicarious  reinforcement  is  rooted  in 
fundamental  cognitive  mechanisms  that  evolved  early  in  the  primate 
clade.  Throughout  primate  evolution,  vicarious  reinforcement  may 
have  served  as  a  core  building  block  for  complex  social  behaviors 
such  as  cooperation  and  competition,  while  facilitating  observational 
learning  and  group  coordination.  We  also  note  that  our  experimental 
design  provides  a  powerful  tool  for  exploring  the  neural  mechanisms 
underlying  social  learning  and  decision-making  and  thus  will  be  of 
use  to  comparative  psychologists  and  neuroscientists  alike. 

Neuronal reference frames for social decisions in
primate  frontal  cortex 

Steve W C Chang1,2, Jean-François Gariépy2 & Michael L Platt1-3

Social  decisions  are  crucial  for  the  success  of  individuals  and  the  groups  that  they  comprise.  Group  members  respond  vicariously 
to  benefits  obtained  by  others,  and  impairments  in  this  capacity  contribute  to  neuropsychiatric  disorders  such  as  autism  and 
sociopathy.  We  examined  the  manner  in  which  neurons  in  three  frontal  cortical  areas  encoded  the  outcomes  of  social  decisions  as 
monkeys  performed  a  reward-allocation  task.  Neurons  in  the  orbitofrontal  cortex  (OFC)  predominantly  encoded  rewards  that  were 
delivered  to  oneself.  Neurons  in  the  anterior  cingulate  gyrus  (ACCg)  encoded  reward  allocations  to  the  other  monkey,  to  oneself 
or to both. Neurons in the anterior cingulate sulcus (ACCs) signaled reward allocations to the other monkey or to no one.
In this network of received (OFC) and foregone (ACCs) reward signaling, ACCg emerged as an important nexus for the computation
of  shared  experience  and  social  reward.  Individual  and  species-specific  variations  in  social  decision-making  might  result  from  the 
relative  activation  and  influence  of  these  areas. 


Social cohesion depends on vicarious identification with members
of one’s group. In social situations, we are aware of our actions and
their consequences, but also consider those of others, especially those
with whom we might interact1. We also estimate the internal states
of others, perhaps by simulation2, which in turn shapes our future
actions. Social situations can drive observational learning3, and
other-regarding preferences influence neural computations that ultimately
result in cooperation, altruism or spite4,5. Disruptions of neural circuits
involved in other-regarding processes may underlie social deficits
attending neuropsychiatric conditions like autism6. Human imaging
and clinical studies have found critical links between social deficits and
abnormal brain activity in frontal cortex and its subcortical targets7.

Neural circuits involved in reinforcement learning and decision-making
are crucial for normal social interactions8. Critical nodes
include ACC9-11, the OFC12-17 and subcortical areas, such as the
dopaminergic ventral tegmental area, substantia nigra18,19, the
striatum20,21, the lateral habenula22 and the amygdala23. Neuroimaging
studies in humans report activation of some of these areas by both
giving rewards and receiving rewards24-28, and lesions to some of
these areas result in impaired social decision-making7. These findings
suggest that a generic circuit for reward-guided learning and decision-making
mediates social decisions8. Despite this evidence, and the
clear clinical relevance of understanding the neurobiology of social
decision-making, precisely how neurons in any of these areas compute
social decisions remains unknown, largely because of difficulties
in implementing social interactions while simultaneously studying
neuronal activity and controlling contextual variables. Single-unit
recording studies in nonhuman animals, such as macaques, making
social decisions of similar complexity to those made by humans would
help to address this gap.


We  implemented  a  reward-allocation  task  in  pairs  of  rhesus 
macaques  while  recording  from  single  neurons  in  three  critical  nodes 
in  the  decision-making  network,  namely  the  ACCg,  ACCs  and  OFC. 
Our  study  capitalized  on  monkeys’  willingness  to  engage  with  a  social 
partner via an interposed computer system while simultaneously controlling
the sensory and reward environment. We specifically matched
choices  for  the  reward  outcomes  directly  received  by  the  actor 
monkey  (decision  maker)  and  controlled  for  potential  secondary 
acoustic  reinforcement  effects  associated  with  delivering  juice  to  the 
recipient  monkey.  In  these  conditions,  we  found  regional  biases  in 
the  encoding  of  social  decision  outcomes  with  respect  to  self  and 
another  individual.  In  this  network  of  received  (OFC)  and  foregone 
(ACCs)  reward  signals,  ACCg  emerged  as  an  important  nexus  for  the 
computation  of  shared  experience  and  social  reward. 

RESULTS 

Summary  of  behavior  in  the  reward-allocation  task 

On  half  of  the  trials,  termed  choice  trials,  actor  monkeys  chose 
between  visual  stimuli  that  led  to  juice  being  delivered  either  to 
themselves  (self  reward),  to  the  recipient  monkey  (other  reward)  or 
to neither monkey (neither reward). Offers appeared in pairs of three
types, which defined self:neither trials, self:other trials and
other:neither trials (Fig. 1). On the other half, termed cued trials, monkeys
observed  a  single  cue  that  indicated  that  self,  other  or  neither  rewards 
would  be  delivered  by  the  computer. 

Actor monkeys performed the reward-allocation task well (Fig. 2a),
as indicated by the low mean number of incomplete trials per session
(4.6 ± 0.2% (s.e.m.); Online Methods), even when the actors had no chance
of obtaining juice rewards themselves, which was the case for
other:neither choice trials and for other and neither cued trials (7.4 ± 0.3%).
Figure 1 Reward-allocation task. (a) Experimental setup for an actor and
a recipient monkey. (b) Stimulus-reward outcome mappings for reward
delivered to actor (self), recipient (other) or no one (neither), shown
separately for each actor. (c) Magnitude cue used to indicate juice amount
at stake for each trial (see d). The position of the horizontal bisecting line
specified the percentage of maximum reward that was possible. (d) Task
structure (see Online Methods). Top fork, cued trials; bottom fork, choice
trials. Dashed gray lines show the angle of the actor’s gaze, converging
on the fixation point. Eye cartoons indicate times at which the actor
could look around. ITI, inter-trial interval; MT, movement time;
RT, reaction time.

Actor monkeys also made significantly fewer errors when they made
active  decisions  (choice  trials)  than  when  there  was  no  choice  (cued 
trials)  or  when  there  was  no  reward  at  stake  for  themselves  (P  <  0.0001, 
Welch  two-sample  t  test).  These  findings  suggest  that  monkeys  find 
it  rewarding  to  actively  choose  what  to  do  and  can  be  motivated  to 
work  without  direct  reinforcement. 

Reaction times often serve as a proxy for motivation in incentivized
tasks29-33. Reaction times for making different choices demonstrate
that actors discriminated the reward types and had orderly preferences
amongst them29,33. Actors were fastest to choose self rewards,
followed by other rewards and neither rewards (Fig. 2b). Self versus
other reaction times differed by a mean of 39 ms (P < 0.0001, Welch
two-sample t test); other versus neither reaction times differed by a
mean of 20 ms (P < 0.0001). The ordered reaction times by monkeys
making choices in the reward allocation task suggest that rewarding
self is more reinforcing than rewarding the recipient, which is in turn
more  reinforcing  than  rewarding  no  one33. 
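
The reaction-time comparisons above are reported with Welch two-sample t tests, which do not assume equal variances in the two groups. As a hedged illustration only, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom with the Python standard library. The function name and the reaction-time values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    vx = variance(x) / len(x)  # squared standard error of the mean of x
    vy = variance(y) / len(y)  # squared standard error of the mean of y
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical per-session median RTs in ms (illustrative, not real data).
rt_self = [182, 175, 190, 168, 177, 185]
rt_other = [221, 214, 230, 209, 218]
print(welch_t(rt_self, rt_other))
```

A p value would then be obtained from the t distribution with `df` degrees of freedom, for which a statistics library is the usual route.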

Finally,  actor  monkeys  shifted  gaze  to  the  recipients  more  frequently 
following juice delivery to them than after juice delivery to themselves
or to neither monkey, consistent with greater interest in the
actions  of  the  other  monkey  when  he  was  rewarded  (Supplementary 
Fig.  1).  Taken  together,  these  observations  support  the  conclusion 
that  the  actor  monkeys  were  acutely  aware  of  the  difference  between 
self,  other  and  neither  reward  outcomes33. 

We  quantified  decision  preferences  by  calculating  a  contrast  ratio 
based  on  actors’  choices  (equation  (1),  Online  Methods).  Consistent 
with  our  previous  reports33,34,  actors  preferred  self  rewards  over 
other  or  neither  rewards,  but  preferred  other  over  neither  rewards 
(Fig. 2c). On self:neither and self:other trials, actor monkeys almost
always chose to reward self (preference index, mean ± s.e.m.:
self:neither, -0.99 ± 0.00; self:other, -0.99 ± 0.00; significantly different
from zero: both P < 0.0001, one sample t test; Fig. 2c). In contrast,
on other:neither trials, actors preferred to allocate rewards to the
recipient monkey (0.17 ± 0.01, P < 0.0001, one sample t test; Fig. 2c).
We  observed  similar  choice  preferences  for  each  actor  individually 
(Supplementary  Fig.  2). 
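
To give a feel for the size of these preference indices, the sketch below assumes that the contrast ratio has the common normalized form (n_a - n_b)/(n_a + n_b), where n_a and n_b are the numbers of choices of the two options in a pair. Equation (1) of the Online Methods is not reproduced here and may differ in detail, and both function names are hypothetical. Under this assumption an index c corresponds to a choice ratio n_a/n_b = (1 + c)/(1 - c).

```python
def preference_index(n_a, n_b):
    """Assumed contrast ratio of choice counts, ranging from -1 to +1."""
    return (n_a - n_b) / (n_a + n_b)


def index_to_ratio(index):
    """Choice ratio n_a / n_b implied by the assumed index (undefined at index = 1)."""
    return (1 + index) / (1 - index)


# An index of 0.17 implies about 1.4 choices of option a per choice of option b.
print(round(index_to_ratio(0.17), 2))
# An index of -0.99 implies that option a is chosen about once per 199 choices of b.
print(round(1 / index_to_ratio(-0.99)))
```

Read this way, the other:neither preference is modest in absolute terms, while the self:neither and self:other preferences are close to exclusive.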

We previously found that the preference to allocate reward to the
other monkey is enhanced by greater familiarity between the two
animals and is abolished if the recipient is replaced with a juice
collection bottle33. We also observed that reward withholding is
reduced when actor monkeys are dominant toward recipients, and
that the variability and the degree of preferences often depend on the
identity of the recipients33. Furthermore, we found that actor monkeys
prefer to deliver juice to themselves rather than to both themselves and the
recipient simultaneously, perhaps reflecting the competitive nature of
simultaneously drinking juice, a resource controlled outside of experimental
sessions to motivate performance and often monopolized by
dominant monkeys living in pairs with subordinate monkeys in their
home cages33 (M.L.P., unpublished observation). Finally, exogenously
increasing oxytocin levels in the CNS amplifies actors’ preference to
allocate reward to the other monkey over no one34. Taken together,
these patterns of behavior endorse the fundamentally social nature
of the reward-allocation task.

Figure 2 Behavior in the reward-allocation task. (a) Proportions of
incomplete trials (mean ± s.e.m.) (see Online Methods) during the
reward-allocation task. (b) Choice reaction times (ms) from trials in which rewards
were chosen for self, other or neither (mean of session medians ± s.e.m.).
(c) Choice preferences (preference index, mean ± s.e.m.) as a function of
reward outcome contrasts. Data points next to each bar show the biases
for individual sessions. The degree of preference axis on the right shows
the range of preference indices in ratio terms. (d) Choice preferences
(mean ± s.e.m.) as a function of reward magnitude on 219 single-unit
sessions collected with the magnitude cue.

Figure 3 Single neurons and population responses from ACCg. (a) Structural magnetic
resonance image from actor MO, with example electrode paths for ACCg, ACCs and
OFC. cgg, cingulate gyrus; cgs, cingulate sulcus; lofs, lateral orbitofrontal sulcus;
mid, midline; mofs, medial orbitofrontal sulcus; ps, principal sulcus. (b) Mean responses
(peri-stimulus time histograms, PSTHs) and spike rasters for an other-reward preferring ACCg
neuron on choice trials (upper, solid traces) and cued trials (lower, dashed traces). Data are
aligned to choice/cue offset (left) and reward onset (right) for each reward outcome.
Bar histograms on right show mean ± s.e.m. activity from the two epochs (gray regions).
Color codes for PSTH traces and histograms are shown below.
(c) PSTHs and spike rasters for a self-reward preferring ACCg neuron. (d) PSTHs and spike rasters for a shared self and other reward-preferring ACCg
neuron. (e) Normalized choice/cue epoch and reward epoch responses for 81 ACCg neurons. Data in c-e are presented as in b. In all bar histogram insets,
the horizontal lines above different conditions indicate significant differences (black, P < 0.05 by paired t test; green, P < 0.05 by bootstrap test).

We also found that preferences scaled with the magnitude of juice
on offer. With larger amounts of juice at stake, actors became more
motivated to receive rewards (self:neither and self:other, slope significantly
different from zero: both P < 0.001, type II regression) and
to allocate rewards to the other monkey over no one (other:neither,
P < 0.05) (Fig. 2d). These findings suggest that both direct and
vicarious reinforcement processes that motivate social decisions are
magnified by reward magnitude25-27.

Differential  encoding  of  social  decision  outcomes 

We recorded the activity of single neurons in ACCg (n = 81), ACCs
(n = 101) and OFC (n = 85) from two actor monkeys (Fig. 3a) during
the reward-allocation task, and analyzed the data for both a choice/cue
epoch and a reward epoch (Online Methods; data for individual monkeys
are shown in Supplementary Fig. 3). Overall, we found notable
similarities in activity and functional classes across the choice and
reward epochs (Supplementary Fig. 4). We examined single-neuron
and population responses from ACCg (Fig. 3), ACCs and OFC
(Fig. 4), followed by further quantifications in each region (Fig. 5).

ACCg contained neurons selective for allocating rewards to another
individual, receiving rewards or both. One class of ACCg neuron
(Fig. 3b) preferentially responded when actors chose to allocate
reward to recipients. On choice trials, this example neuron discharged
more strongly when the actor chose other rewards (7.12 ± 0.66 (mean
and s.e.m.), spikes per s) compared with self rewards on either
self:neither or self:other trials (4.95 ± 0.36 and 4.93 ± 0.45 spikes per s,
respectively; both P < 0.01, Welch two sample t test), and also preferred
other rewards over neither rewards (4.44 ± 0.79 spikes per s,
P < 0.05). This neuron did not differentiate self from neither rewards
(P = 0.97, Welch two sample t test). On cued trials, this neuron only
weakly preferred other over self or neither rewards (both P = 0.08,
Welch two sample t test; Fig. 3b).

In contrast, another class of ACCg neuron (example neuron
in Fig. 3c) responded selectively for choosing self rewards. The
example neuron discharged more when the actor chose to reward
himself on self:neither and self:other trials (4.77 ± 0.38 and
5.70 ± 0.41 spikes per s, respectively) compared with choosing other
and neither rewards (2.02 ± 0.32 and 1.60 ± 0.39 spikes
per s, respectively) (all P < 0.0001, Welch two
sample t test; Fig. 3c). Moreover, it showed
stronger responses when the actor monkey
received rewards in the self:other than in the
self:neither context, but this effect did not reach
statistical significance (P = 0.10, Welch two
sample t test). On cued trials, this neuron preferred
self over other or neither rewards (both
P < 0.0001, Welch two sample t test). For both
choice and cued trials, the response did not
differentiate other and neither rewards (both
P > 0.23, Welch two sample t test).

Figure 4 Single neurons and population
responses from ACCs and OFC. (a) PSTHs and
spike rasters for a single ACCs neuron preferring
forgone rewards. Data are aligned to choice/cue
offset (left) and reward onset (right) for each
reward outcome. Bar histograms on right show
mean ± s.e.m. activity from the two epochs
(gray regions). (b) PSTHs and spike rasters
for a single OFC neuron preferring self reward.
(c) Normalized reward epoch responses of 101
ACCs neurons. (d) Normalized choice/cue epoch
and reward epoch responses of 85 OFC neurons.
In all panels, data are presented as in Figure 3.

Finally, a third class of ACCg neuron (example neuron in Fig. 3d) responded equivalently to both received rewards (self:neither, 15.28 ± 0.70 spikes per s; self:other, 16.47 ± 0.81) and allocated rewards to other (15.81 ± 1.16 spikes per s) (both P > 0.64, Welch two sample t test), but responded significantly less to neither rewards (10.17 ± 1.23 spikes per s; other versus neither and self versus neither, both P < 0.005).
Similarly,  on  cued  trials,  this  neuron  preferred  other  over  neither 
rewards  (P  <  0.05,  Welch  two  sample  t  test),  but  did  not  differentiate 
between  self  and  other  rewards  (P  =  0.27). 

Notably, the solenoid valves controlling juice delivery (including one for neither rewards that only produced clicks) were placed outside the experimental room, and white noise was played inside the room during sessions. These controls rule out a simple explanation that other reward-specific (Fig. 3b) and shared self/other reward responses (Fig. 3d) were merely sensory responses to the sounds of the reward-delivery mechanism.

To  contrast  population  coding  of  decision  and  reward  information 
in  various  conditions,  we  computed  a  normalized  activity  bias  between 
each  pair  of  outcomes,  expressed  as  a  proportional  modulation  in 
mean  firing  rates  normalized  by  baseline  firing  rate.  In  the  ACCg 
population,  the  mean  normalized  activity  bias  for  other  over  neither 
rewards (other versus neither) was 0.21 ± 0.10 (s.e.m.), a 21% difference, which was significant (P < 0.05, paired t test; Figs. 3e and 5a). Similarly, the bias for self (from self:other) over neither rewards was 0.20 ± 0.12 (P = 0.09, paired t test). Notably, the population showed equivalent responses for self rewards (self:other) and other rewards (0.01 ± 0.12, P = 0.96, paired t test). On the other hand, it showed a significant bias for self rewards when the actors were presented with a choice between rewarding themselves and recipients compared with when the actors were presented with a choice between rewarding themselves and no one (self:other versus self:neither, 0.17 ± 0.08, P < 0.05, paired t test), suggesting that ACCg is particularly sensitive to a reward context involving an option to reward another individual.
Thus,  the  ACCg  population  showed  an  equivalent  preference  for  other 
and  self  rewards,  and  preferred  both  over  neither  rewards. 
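
As a rough illustration of the population comparison described above, the sketch below computes a per-neuron activity bias and the corresponding paired t statistic. The exact normalization is defined in the Online Methods and is not reproduced here; the form used (difference in mean firing rate between two outcomes divided by the baseline rate), the function names and the toy firing rates are all assumptions made for illustration.

```python
import math
import statistics


def activity_bias(rate_a, rate_b, baseline):
    """Proportional modulation of outcome A over outcome B for one neuron.

    Assumed form: (rate_a - rate_b) / baseline. The paper's exact
    normalization is described in its Online Methods.
    """
    if baseline <= 0:
        raise ValueError("baseline firing rate must be positive")
    return (rate_a - rate_b) / baseline


def paired_t(biases):
    """Mean, s.e.m. and t statistic testing whether the mean bias is zero.

    A paired t test on two outcomes is a one-sample t test on the
    per-neuron differences, which is what the bias values are.
    """
    n = len(biases)
    mean = statistics.fmean(biases)
    sem = statistics.stdev(biases) / math.sqrt(n)
    return mean, sem, mean / sem


# Hypothetical (other, neither, baseline) rates in spikes per s.
neurons = [(7.1, 4.4, 5.0), (15.8, 10.2, 12.0), (3.0, 3.1, 2.5), (6.0, 5.0, 4.0)]
biases = [activity_bias(a, b, base) for a, b, base in neurons]
mean, sem, t = paired_t(biases)
print(f"bias = {mean:.2f} ± {sem:.2f}, t({len(biases) - 1}) = {t:.2f}")
```

A bias of 0.21 in this scheme would correspond to the 21% difference quoted in the text; converting the t statistic to a P value requires the t distribution with n - 1 degrees of freedom, which is left out of this sketch.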

On  cued  trials,  however,  a  notably  different  pattern  emerged.  The 
population  responded  strongly  to  self  rewards,  but  barely  responded 
to other rewards (0.59 ± 0.32, P = 0.07, paired t test; Fig. 3e). Furthermore, the population responded no differently to other and neither rewards (0.22 ± 0.14, P = 0.14, paired t test).

Taken  together,  these  results  indicate  that  ACCg,  as  a  population, 
encodes  both  giving  and  receiving  rewards.  At  the  population  level, 
neuronal  activity  selective  for  allocating  rewards  to  another  individual 
was  specific  to  active  decisions  (Fig.  3e),  similar  to  what  has  been 
reported  by  functional  magnetic  resonance  imaging  of  human  ventral 
striatum  during  voluntary  versus  forced  charitable  donations25.  The 
confluence  of  neurons  selectively  responsive  to  self,  other  and  both 
(self  and  other)  rewards  in  ACCg  suggests  that  this  area  contains 
the  information  necessary  to  mediate  the  vicarious  reinforcement 
processes  that  appear  to  motivate  actors  to  give  to  recipients. 

Figure  4a  shows  a  typical  ACCs  neuron  that  fired  more  strongly 
preceding  other  and  neither  rewards  than  self  rewards.  On  choice 
trials, this neuron discharged more strongly when the actor monkey chose not to reward himself (other rewards, 19.64 ± 2.15 spikes per s; neither rewards, 18.19 ± 2.03) compared with when he chose to reward himself directly (self:neither, 10.31 ± 0.86 spikes per s; self:other, 9.79 ± 0.81) (all P < 0.001, Welch two sample t test). The example neuron responded equivalently to self rewards in self:other and self:neither contexts (P = 0.66, Welch two sample t test), and responded equivalently to other and neither rewards (P = 0.62), consistent with encoding 'foregone' rewards. On cued trials, this neuron
responded  equivalently  to  other  and  neither  rewards  (P  =  0.39,  Welch 
two  sample  t  test),  but  responded  less  to  self  rewards  (both  P  <  0.005), 
resembling  the  responses  to  active  decisions. 




VOLUME  16  |  NUMBER  2  |  FEBRUARY  2013  NATURE  NEUROSCIENCE 


2013  Nature  America,  Inc.  All  rights  reserved. 




[Figure 5 panel labels: ACCg (n = 81), ACCs (n = 101), OFC (n = 85). Axis labels: self reward activity (self:other), neither reward activity (other:neither), self reward activity (self:neither). Marked example cells: 1,519, 1,503 and 1,520 (ACCg panels), 2,059 (ACCs panels), 1,099 (OFC panels). Legend: frames of reference used by significant neurons (self-referenced, other-referenced, both-referenced).]

Figure 5 Population biases for self, other and neither rewards. (a-c) Scatter plots show mean normalized reward epoch responses (proportion of modulation relative to baseline) of individual neurons (from left to right) between self (self:other) and other rewards, between other and neither rewards, between self rewards from self:neither and self:other contexts, and between self (self:neither) and neither rewards for ACCg (a), ACCs (b) and OFC (c) populations. Regression lines (type II) are shown in red (the circled data points are excluded from the regression). Unity lines are shown in black. The example neurons from Figures 3 and 4 are indicated on the scatter plots. (d) Proportion of neurons (out of significantly classified neurons) from OFC, ACCs and ACCg using self-referenced, other-referenced and both-referenced frames to represent reward outcomes. Inset shows color codes used in the bar graph. Bars indicate significant differences in proportions (P < 0.05, χ² test).


Figure  4b  shows  a  typical  OFC  neuron  that  preferentially  encoded  juice 
rewards  received  by  the  actor.  On  choice  trials,  this  neuron  discharged 
substantially  more  for  self  rewards  than  for  the  alternatives  on  both 
self:neither and self:other trials. Activity for self rewards did not differ between the two self reward contexts (7.00 ± 0.47 and 7.03 ± 0.46 spikes per s, respectively; P = 0.97, Welch two sample t test), but it exceeded the cell's activity for other and neither rewards (3.06 ± 0.40 and 1.85 ± 0.42 spikes per s, respectively; both P < 0.0001). On cued trials, this neuron responded more strongly to self rewards than to both other and neither
rewards  (both  P  <  0.0001,  Welch  two  sample  t  test),  but  it  did  not  respond 
differently  between  other  and  neither  rewards  (P  =  0.25)  (Fig.  4b). 

The  ACCs  population  showed  a  strong  and  equivalent  response 
bias for foregone rewards (self versus other, activity bias = 0.31 ± 0.07; self versus neither, activity bias = 0.25 ± 0.08, both P < 0.005, paired t test; Figs. 4c and 5b). The population did not differentiate other from neither rewards (0.06 ± 0.06, P = 0.31, paired t test). Unlike ACCg, the population did not respond differentially to self:other and self:neither contexts (differed by 0.003 ± 0.02, P = 0.90, paired t test).
We  found  similar  patterns  on  cued  trials:  responses  to  self  rewards 
were  substantially  reduced  compared  with  other  rewards  (0.19  ±  0.09, 
P  <  0.05,  paired  t  test)  and  neither  rewards  (0.18  ±  0.10,  P  <  0.08) 
(Fig.  4c).  These  results  indicate  that,  during  social  interactions,  ACCs 
neurons  predominantly  signal  foregone  rewards. 

The  OFC  population  predominantly  encoded  self  rewards  compared 
with other and neither rewards. The bias for self over other rewards was 30% (0.30 ± 0.09, P < 0.005, paired t test). For self versus neither rewards, the bias was also significant (0.17 ± 0.08, P < 0.05, paired t test; Figs. 4d and 5c). Population activity for other and neither rewards did not differ (0.08 ± 0.06, P = 0.20, paired t test; Figs. 4d and 5c). Unlike ACCg, the population did not respond differentially to self:other and self:neither contexts (differed by 0.06 ± 0.07, P = 0.39, paired t test). On cued trials, the self reward bias was not present compared with other rewards (0.19 ± 0.16, P = 0.24, paired t test) and was only weakly present over neither rewards (0.26 ± 0.15,
P  <  0.08).  On  cued  trials,  the  population  did  not  distinguish  other 
rewards  from  neither  rewards  (P  =  0.33,  paired  t  test;  Fig.  4d).  These 
results  indicate  that  OFC  neurons  predominantly  encode  rewards 
received  by  the  actors  and  that  this  information  was  encoded  more 
faithfully  during  active  decision-making. 

Neuronal  reference  frames  for  social  decisions 

Neuroimaging and scalp-recording studies in humans can only study neuronal activity at an aggregate level. Our single-unit recording data therefore provide a unique opportunity to quantify the frame of reference in which individual neurons in ACCg, ACCs and OFC encode social decisions. To do this, we classified cells from each area on the basis of an analysis of variance (ANOVA) of neuronal activity of individual neurons with reward outcome (self, other or neither), trial type (choice or cued) and reward magnitude (small, medium or large) as factors (Online Methods). Reward epoch responses differed significantly (P < 0.05) for a large number of neurons from all areas in a manner that depended on reward outcome (ACCg, 57%; ACCs, 72%; OFC, 57%), trial type (ACCg, 36%; ACCs, 52%; OFC, 45%) and reward volume (ACCg, 12%; ACCs, 25%; OFC, 24%) (Supplementary Table 1). Furthermore, we observed marked similarities in reward outcome coding across the choice/cue and reward epochs (Supplementary Fig. 4).

On  the  basis  of  the  statistical  significance  of  the  ANOVA  during 
the  choice/cue  and  reward  epochs,  we  identified  individual  neurons 
as self-referenced (modulation referenced to self rewards, preferring either self or foregone rewards), other-referenced (modulation referenced to other rewards), both-referenced (modulation referenced to both self and other rewards, but not neither rewards) or unclassified (Online Methods). We considered the proportion of different cell types among the classified neurons based on this scheme. In OFC, 80% (n = 36 of 45 neurons) were self-referenced, whereas only 9% (4 of 45) were other-referenced and 11% (5 of 45) were both-referenced (both P < 0.0001, χ² test; Fig. 5d). In ACCs, 72% (51 of 71) were self-referenced, whereas only 14% (10 of 71) were other-referenced and 14% (10 of 71) were both-referenced (both P < 0.0001, χ² test; Fig. 5d). In contrast, ACCg contained similar proportions that were self-referenced (38%, 12 of 32), other-referenced (31%, 10 of 32) and both-referenced (31%, 10 of 32) (P > 0.79, χ² test; Fig. 5d). Notably, ACCg contained a significantly higher proportion of neurons (>60%) that were sensitive to the reward outcome of the recipient monkey (other-referenced and both-referenced) than either OFC or ACCs (both P < 0.005, χ² test; Fig. 5d). ACCg also contained a significantly smaller proportion of self-referenced neurons than either OFC or ACCs (both P < 0.005, χ² test). Finally, we found similar results when we repeated the analysis and included trial-by-trial choice reaction times as covariates
(Supplementary  Fig.  5). 
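
To make the proportion comparisons concrete, the following sketch runs a Pearson χ² test on a 2 × 2 table built from the counts reported above (ACCg: 20 of 32 classified neurons other- or both-referenced; OFC: 9 of 45). How the original tests were laid out, and whether a continuity correction was applied, is not stated in this section, so the uncorrected 2 × 2 form and the function name are assumptions.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom, for which
    the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one count")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, math.erfc(math.sqrt(stat / 2))


# Rows: ACCg, OFC. Columns: other- or both-referenced, self-referenced.
stat, p = chi2_2x2(20, 12, 9, 36)
print(f"chi2 = {stat:.2f}, P = {p:.5f}")
```

With these counts the statistic comes out at about 14.4, which is in line with the reported P < 0.005 for the ACCg versus OFC comparison.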

To test whether different neuronal frames of reference (self-, other- and both-referenced) were anatomically segregated, we used principal component analysis on recording coordinates to identify the major axis with the largest dispersion in three-dimensional space. We then projected neurons to that axis to test differential distributions in individual monkeys separately (Fig. 6). We did not observe any systematic anatomical clustering among different frames of reference; self-, other- and both-referenced neurons in ACCg, ACCs and OFC were intermingled (all P > 0.56, Wilcoxon rank sum test).

Figure 6 Anatomical projections of recorded locations of all ACCg, ACCs and OFC cells. Recording sites were transformed from chamber coordinates into interaural coordinates. The interaural coordinates of individual cells from both monkeys were then projected onto standard stereotaxic maps of rhesus monkeys50, with a 2-mm interaural spacing in the anterior-posterior dimension. Cells are shown on coronal slices and color-coded for the types of frames of reference used, as specified in Supplementary Table 1 (see box). The lateral view of the brain (inset) shows the locations of the coronal sections. Cd, caudate; cgs, cingulate sulcus; lorb, lateral orbitofrontal sulcus; morb, medial orbitofrontal sulcus; ps, principal sulcus; ros, rostral sulcus.
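
The anatomical analysis (finding the axis of largest dispersion of the recording coordinates and projecting each neuron onto it) can be sketched as follows. This is a minimal illustration rather than the authors' code: it swaps a full principal component analysis for power iteration on the 3 × 3 covariance matrix, and the coordinates, labels and function names are hypothetical.

```python
import math


def principal_axis(points, iterations=100):
    """Mean and unit vector along the direction of largest variance of 3-D points.

    Uses power iteration on the covariance matrix, a simple stand-in for a
    full principal component analysis.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(3)]
           for i in range(3)]
    # Start from the most distant point, which in practice has a component
    # along the principal axis.
    v = list(max(centered, key=lambda c: sum(x * x for x in c)))
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("points have no variance")
        v = [x / norm for x in w]
    return mean, v


def project(points, mean, axis):
    """Signed position of each point along the axis, relative to the mean."""
    return [sum((p[i] - mean[i]) * axis[i] for i in range(3)) for p in points]


# Hypothetical recording coordinates (mm) with frame-of-reference labels.
cells = [
    ((30.0, 2.0, 18.0), "self"),
    ((31.5, 2.5, 19.0), "other"),
    ((33.0, 3.0, 20.5), "both"),
    ((34.0, 3.5, 21.0), "self"),
    ((36.0, 4.0, 22.5), "other"),
]
coords = [xyz for xyz, _ in cells]
mean, axis = principal_axis(coords)
positions = project(coords, mean, axis)
for (_, label), pos in zip(cells, positions):
    print(f"{label:>5}: {pos:+.2f} mm along the principal axis")
```

The projected positions would then be compared between classes, for example with the Wilcoxon rank sum test named in the text; that step is omitted here.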

Next,  we  examined  whether  differential  encoding  of  self,  other 
and  neither  rewards  was  also  present  before  making  a  decision. 
We  found  very  little  evidence  for  systematic  signals  early  in  the 
trial  just  after  target  onset  (50-250  ms  after  target  onset).  In  ACCg, 
only zero, three and one cells were classified into the self-, other- and both-referenced classes, respectively, with only 12% of neurons showing a significant effect of reward type. In ACCs, only one, two and three cells belonged
to  each  category,  with  only  22%  of  the  neurons  showing  significant 
reward  type  effects.  Similarly,  in  OFC,  only  two,  two  and  four  cells 
belonged  to  each  category,  with  only  28%  of  the  neurons  showing 
significant  reward  type  effects.  Thus,  in  our  reward  allocation  task, 
signals  in  ACCg,  ACCs  and  OFC  appear  to  emerge  around  the  time 
of  choice  and  reward  delivery. 

When we examined the reward magnitude sensitivities of individual neurons, we found the population in ACCs to be most sensitive (Supplementary Figs. 6 and 7). Furthermore, signal-to-noise ratios in neuronal responses to specific reward outcomes were largely consistent with the preferred neuronal encoding scheme in each region (Supplementary Fig. 8). None of our findings were driven by whether
or  not  actors  looked  at  recipients  (Supplementary  Fig.  9). 

Finally,  we  examined  whether  session-to-session  variation  in 
prosocial tendencies on other:neither trials (Fig. 2c) could be explained by variability in the responses of ACCg neurons, the population most sensitive to others' rewards. We split recording sessions on the basis of actors' choices on other:neither trials into two categories: more prosocial (higher other over neither choices relative to the median preference index) and less prosocial (lower other over neither choices relative to the median preference index). Actors tended to be more prosocial on recording sessions when other-referenced and both-referenced ACCg neurons showed less variability in spiking during the reward epoch (P < 0.05, bootstrap test; Fig. 7a). In contrast, we found that self-referenced ACCg neurons generated more variable responses during the reward epoch in sessions in which actors were more prosocial (P < 0.05, bootstrap test). ACCs neurons did not show any systematic relationship between response variance and behavior (P = 0.47, bootstrap test; Fig. 7b). Notably, OFC neurons showed a pattern similar to that of self-referenced ACCg neurons (P < 0.005, bootstrap test; Fig. 7c). These findings suggest a strong link between prosocial behavior and the fidelity of social reward signals carried by those neurons that incorporate the experience of others into their responses. This could be a result of enhanced attention to the recipient or other processes known to influence signal-to-noise in cortical neurons.

Figure 7 Prosocial behavior and the fidelity of neuronal responses on other:neither trials. (a) ACCg. (b) ACCs. (c) OFC. Coefficients of variation in firing rate (CV; Online Methods) during the reward epoch on other reward trials are plotted as a function of whether actors were more or less prosocial on other:neither trials on the basis of median split (higher: preference index greater than median; lower: preference index less than median). *P < 0.05, bootstrap test.
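
The median-split comparison of response variability can be sketched as below. The definition of CV used in the paper is given in the Online Methods; this sketch assumes the usual form (standard deviation divided by mean firing rate across trials), leaves out sessions that sit exactly on the median, and does not implement the bootstrap test. All names and numbers are hypothetical.

```python
import statistics


def coefficient_of_variation(rates):
    """Standard deviation of firing rate divided by its mean (assumed CV form)."""
    mean = statistics.fmean(rates)
    if mean == 0:
        raise ValueError("mean firing rate is zero")
    return statistics.stdev(rates) / mean


def median_split(sessions):
    """Group per-session CVs by preference index above or below the median.

    `sessions` is a list of (preference_index, cv) pairs; sessions sitting
    exactly on the median are left out.
    """
    median = statistics.median(index for index, _ in sessions)
    higher = [cv for index, cv in sessions if index > median]
    lower = [cv for index, cv in sessions if index < median]
    return higher, lower


# Hypothetical sessions: (preference index on other:neither trials,
# reward-epoch firing rates on other reward trials, spikes per s).
data = [
    (0.30, [10.0, 11.0, 9.5, 10.5]),
    (0.10, [8.0, 12.0, 6.0, 14.0]),
    (-0.05, [5.0, 9.0, 3.0, 11.0]),
    (0.25, [7.0, 7.5, 6.5, 7.0]),
]
sessions = [(index, coefficient_of_variation(rates)) for index, rates in data]
higher, lower = median_split(sessions)
print(f"mean CV, more prosocial: {statistics.fmean(higher):.2f}")
print(f"mean CV, less prosocial: {statistics.fmean(lower):.2f}")
```

In this toy data set the more prosocial sessions have the lower CV, mirroring the direction reported for other- and both-referenced ACCg neurons.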


DISCUSSION 

Our findings strongly endorse the hypothesis that distinct frontal regions contribute uniquely to social decisions by differentially processing decision outcomes with respect to actors (self) and their partners (other). The finding that OFC neurons selectively encode self reward is consistent with previous results implicating this area in representing the subjective value of rewards12,13, but extends those results by demonstrating that such value signals are encoded
egocentrically.  Encoding  of  foregone  rewards  by  ACCs  neurons,  on 
the  other  hand,  is  consistent  with  previous  data  implicating  this  area 
in  error  monitoring  and  behavioral  adjustment35-37.  For  example, 
foregone  reward  signaling  by  ACCs  might  be  used  to  learn  from 
observation,  rather  than  direct  experience,  and  adjust  ongoing 
behavior  during  social  interactions.  Furthermore,  mirroring  of  self 
and  other  rewards  by  ACCg  neurons  is  consistent  with  previous 
studies  linking  this  area  to  specifically  social  functions,  such  as  shared 
experience  and  empathy38. 

Our findings are consistent with those of a previous study examining the effects of lesions in these same brain regions (Online Methods),
which  found  that  ACCg,  but  not  OFC  or  ACCs,  contributes  causally 
to  the  use  of  visual  social  information  to  guide  behavior9.  Specifically, 
ACCg  lesions  completely  abolished  typical  hesitation  to  retrieve  food 
when  confronted  with  social  stimuli9.  Our  findings  also  agree  with 
previous  findings  that  lesions  in  ACCs  impair  the  use  of  reward 
history  to  guide  decisions  adaptively10.  The  differences  between 
ACCs  and  ACCg  that  we  observed  support  and  extend  the  finding 
that learning from experience is mediated by ACCs, whereas learning from feedback from another individual is mediated by ACCg8.
Specifically,  in  a  learning  task  in  which  human  subjects  monitored 
their  history  of  correct  responses  as  well  as  the  advice  given  to  them 
by  a  confederate,  blood  oxygen  level-dependent  (BOLD)  activation 
in  ACCs  tracked  reward  learning  rate,  whereas  BOLD  activation  in 
ACCg tracked social learning rate based on advice from the confederate8. In our study, we propose that ACCs tracked foregone rewards
relative  to  self,  whereas  ACCg  tracked  reward  outcomes  of  another 
individual  in  a  more  complex  manner. 

Notably,  the  ACCg  population  also  responded  more  strongly  when 
monkeys  chose  self  reward  when  the  alternative  was  allocating  reward 
to  the  other  monkey  compared  with  the  response  when  monkeys 
chose self reward when the alternative was rewarding no one. In contrast, neither the OFC neuronal population response nor the ACCs
neuronal  population  response  was  sensitive  to  social  context  when 
monkeys  rewarded  themselves.  Sensitivity  to  social  context  in  ACCg 
endorses  a  specialized  role  for  this  area  in  computing  social  decisions, 
even  when  one  acts  selfishly. 

It is worthwhile to note that a small number of ACCs and OFC neurons, although a much smaller proportion than in ACCg (Fig. 5d, Supplementary Table 1 and Supplementary Fig. 5), were classified as either other- or both-referenced. This observation supports the idea that a small number of ACCs and OFC neurons do carry
information  about  rewards  allocated  to  another  individual.  What  is 
notable  is  that  the  majority  of  OFC  and  ACCs  neurons  (80%  and 
72%,  respectively)  did  not  carry  such  other-regarding  information 
(other-  or  both-referenced),  whereas  the  majority  of  ACCg  neurons 
did  (62%).  This  endorses  a  fundamentally  social  role  for  neurons 
in  ACCg. 

A prior study showed that OFC neurons modulate their activity when a monkey receives juice reward together with another
individual39,  suggesting  that  value  signals  in  OFC  are  sensitive  to 
social  context.  In  that  study,  OFC  neurons  responded  differentially 
as  a  function  of  whether  the  subject  monkey  received  juice  rewards 
alone  or  together  with  another  monkey39.  Our  current  study  builds 
on  and  extends  those  findings  in  three  important  ways.  First,  we 
used a free-choice task that allowed us to infer the subjective value
of  rewards  delivered  to  self,  other  and  no  one.  Notably,  even  in  a 
social  context,  OFC  neurons  were  selective  for  self  reward,  the  most 
preferred outcome. Second, we compared the responses of OFC neurons to responses of neurons in ACCg and ACCs recorded in identical task conditions, allowing us to examine regional differences in
the  encoding  of  social  reward  information  in  primate  frontal  cortex. 
Third,  when  we  compared  responses  of  ACCg  neurons  on  free-choice 
and  cued  trials,  we  found  that  responses  to  rewards  delivered  to  the 
recipient  monkey  were  largely  absent  when  actors  passively  observed 
the event rather than actively choosing it. Taken together, these findings indicate that social context can affect the encoding of reward
information  in  all  three  areas;  OFC  appears  to  evaluate  personally 
experienced  rewards,  ACCs  evaluates  reward  information  that  is  not 
directly  experienced,  and  ACCg  multiplexes  information  about  the 
direct  experience  of  reward  and  vicarious  reinforcement  experienced 
by  allocating  reward  to  another  individual. 

It  is  noteworthy  that  ACCs  neurons  showed  much  less  modulation 
by  actors’  received  reward  outcomes  compared  with  OFC  neurons,  as 
ACCs  neurons  often  show  substantial  modulation  to  received  reward 
in  nonsocial  settings11.  ACCg,  on  the  other  hand,  contains  neurons 
that  compute  reward  signals  in  both  other  and  self  frames  of  reference. 
Together,  our  findings  suggest  that,  as  in  sensory  and  motor  systems40, 
identifying  the  frames  of  reference  in  which  reward  outcomes  are 
encoded may be important for understanding the neural mechanisms
underlying  social  decision-making8. 

Accumulating evidence endorses a special role for the medial-frontal cortex in representing information about another individual8,41-44. For instance, perceived similarity while observing others is correlated with hemodynamic response in the subgenual ACC44. Furthermore, a group of neurons in the primate medial-frontal cortex selectively responds to observing actions performed by other individuals41. Such other-referenced signals, however, are not limited to the medial wall of the frontal cortex. Neurons in the dorsolateral prefrontal cortex (DLPFC) track the behavior of a computer opponent in an interactive game45, and BOLD responses in DLPFC and ventromedial prefrontal cortex during observational learning track observed action and observed reward prediction errors, respectively46. In addition, BOLD activity in anterior frontal areas tracks preferences to donate to charity24. Brain networks involved in mentalizing47, vicarious pain perception48 and empathy49 therefore seem to be critical for mediating social interactions, suggesting that other-regarding cognition is orchestrated by a distributed network of frontal cortical areas.

Social and emotional behaviors are highly idiosyncratic among individuals. Understanding the neural mechanisms that drive such individual differences remains one of the most pressing issues in neuroscience. We hypothesize that the differential activation of neurons in ACCg, ACCs and OFC contributes to individual and,
perhaps,  species  differences  in  social  function. 

METHODS 

Methods  and  any  associated  references  are  available  in  the  online 
version  of  the  paper. 

Note: Supplementary information is available in the online version of the paper.

ONLINE METHODS

General  and  behavioral  procedures.  All  procedures  were  approved  by  the  Duke 
University  Institutional  Animal  Care  and  Use  Committee,  and  were  conducted 
in  compliance  with  the  Public  Health  Service’s  Guide  for  the  Care  and  Use  of 
Laboratory  Animals. 

Two actor (MY and MO) and five recipient monkeys (Macaca mulatta) participated. For all monkeys, a sterile surgery was performed to implant a head-restraint prosthesis (Crist Instruments) using standard techniques11. Six weeks
after  surgery,  monkeys  were  trained  on  a  standard,  center-out,  oculomotor  task 
for  liquid  rewards.  Actor  monkeys  were  then  trained  on  the  reward-allocation 
task  (Fig.  1)  in  the  presence  of  a  recipient.  Subsequently,  a  second  surgery  was 
performed  on  actors  to  implant  a  recording  chamber  (Crist)  providing  access 
to  the  ACCs,  ACCg  and  OFC.  All  surgeries  were  performed  under  isoflurane 
anesthesia  (1-3%,  vol/vol),  and  the  recording  chambers  were  regularly  cleaned, 
treated  with  antibiotics  and  sealed  with  sterile  caps. 

Horizontal and vertical eye positions were sampled at 1,000 Hz using an infrared eye monitor camera system (SR Research Eyelink). Stimuli were controlled by PsychToolBox and Matlab (MathWorks). Actors and recipients sat in primate chairs (Crist), 100 cm from one another at a 45° angle (Fig. 1a). Actors (both males) and recipients (four males, one female) were unrelated and were not cagemates. Different pairs were selected depending on the availability of recipient
monkeys.  Actors  were  housed  in  a  colony  with  12  other  male  rhesus  macaques, 
some  of  which  were  pair-housed.  All  of  the  male  monkeys  resided  in  this  colony 
room,  and  the  one  female  monkey  resided  in  the  adjacent  colony  room  with 
other  females.  Of  the  total  seven  actor-recipient  pairs  that  we  tested,  the  actor 
monkey  was  dominant  over  the  recipient  in  six  cases.  Furthermore,  three  pairs 
could  be  classified  as  ‘more  familiar’  with  one  another  because  their  cages  faced 
each  other,  as  defined  previously33.  Based  on  these  relationships,  we  would  expect 
a  mixture  of  prosocial  and  competitive  preferences,  as  we  previously  found  that 
dominant  actors  are  slightly  less  competitive  than  subordinates,  but  pairs  in  which 
the  actor  is  less  familiar  with  the  recipient  are  slightly  less  prosocial  than  when 
they  are  more  familiar. 

In  the  experimental  setup,  each  monkey  had  his  own  monitor,  which  displayed 
identical  visual  stimuli.  Both  the  actor  and  recipient  monkeys  had  their  own 
tube  from  which  juice  drops  were  delivered.  To  prevent  monkeys  from  forming 
secondary associations of solenoid valve clicks or the sound of the recipient drinking the juice reward with respect to different reward types, the solenoid valves that delivered the juice rewards were placed in another room and white noise was also played in the background. Experimenters were unable to hear solenoids anywhere inside the recording room. Our control of the acoustic environment explicitly rules out a simple explanation that both-referenced reward encoding
found  in  ACCg  is  a  product  of  such  secondary  sensory  associations.  Critically, 
a  separate  solenoid  (also  placed  in  another  room)  was  designated  for  neither 
rewards;  it  produced  clicks,  but  delivered  no  fluid. 

The  face  region  of  the  recipient,  with  respect  to  the  gaze  angle  of  the  actor 
(horizontal  and  vertical  eye  positions),  was  determined  empirically  before  the 
experiments. The frequency with which actors looked at recipients was computed
from the number of gaze shifts to the recipient's face (±8.5° from the center of the
face)33,34. We used a large window to capture gaze shifts that were brief in duration
and large in magnitude and often directed at varying depths (for example,
eyes and mouth; Fig. 1a).
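The window criterion above can be illustrated with a minimal sketch. It assumes that the ±8.5° window is applied independently to the horizontal and vertical eye positions (a square window) and that each gaze shift is summarized by its landing position in degrees; the function name and data layout are hypothetical and are not taken from the original analysis code.

```python
def count_face_gazes(landing_positions, face_center, half_width_deg=8.5):
    """Count gaze shifts landing inside a square window around the face.

    landing_positions: iterable of (horizontal_deg, vertical_deg) tuples.
    face_center: (horizontal_deg, vertical_deg) of the empirically
        determined face center.
    half_width_deg: half-width of the window (assumed square here).
    """
    cx, cy = face_center
    count = 0
    for x, y in landing_positions:
        # A gaze shift counts if both eye-position components fall
        # within +/- half_width_deg of the face center.
        if abs(x - cx) <= half_width_deg and abs(y - cy) <= half_width_deg:
            count += 1
    return count
```

Dividing this count by the number of trials in a condition would give a gaze frequency of the kind compared across conditions.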

Monkeys  performed  the  task  to  obtain  drops  of  cherry-  or  orange-flavored 
juice.  Actors  began  a  trial  by  shifting  gaze  (±2.5°)  to  a  central  stimulus 
(0.5°  x  0.5°),  and  maintained  fixation  (200  ms).  For  219  single-unit  sessions, 
the reward magnitude at stake (0.1-2.4 ml) on each trial was cued by the
position  of  a  horizontal  bisecting  line  (200  ms),  indicating  the  percentage  of 
the  maximum  possible  volume.  There  were  two  kinds  of  trials,  termed  choice 
trials and cued trials. Following a variable delay (300, 500 or 700 ms), choice
and  cued  trials  were  presented  at  equal  probabilities,  randomly  interleaved. 
On  choice  trials,  two  visual  targets  (4°  x  4°)  appeared  at  two  random  locations 
7°  eccentric  in  the  opposite  hemifield.  Actors  shifted  gaze  to  one  target  (±2.5°) 
to  indicate  a  choice  in  the  maximum  allowed  time  of  1.5  s  (from  stimulus 
onset).  The  pair  of  stimuli  appearing  on  a  given  trial  was  drawn  from  the  set 
of three stimuli (Fig. 1b), pseudorandomly selected. On cued trials, actors
maintained fixation (±2.5°) while a cue (4° x 4°) appeared centrally (500 ms).
Cues indicating rewards for the actor, recipient or neither monkey occurred
with equal frequency, pseudorandomly determined (Fig. 1b). Reward onset
occurred after a 0-900-ms delay from the time of either making a choice
or  cue  offset.  Actors  were  free  to  look  around  during  this  delay  and  for  1  s 
after  reward  delivery.  Reward  delivery  was  followed  by  an  intertrial  interval 
of  700,  1,000  or  1,300  ms.  After  making  an  error  (see  below),  both  monkeys 
received  visual  feedback  (a  white  rectangle,  10°  x  10°)  followed  by  a  5-s  time 
out  before  the  next  trial. 

Recording  procedures.  All  recordings  were  made  using  tungsten  electrodes 
(FHC).  Single  electrodes  were  lowered  using  a  hydraulic  microdrive  system 
(Kopf  Instruments  or  FHC).  Single-unit  waveforms  were  isolated  and  action 
potentials  were  collected  using  a  16-channel  recording  system  (Plexon). 

To  guide  the  placement  of  recording  tracks  and  localize  recording  sites,  we 
acquired  structural  magnetic  resonance  images  (MRI;  3T,  1-mm  slices)  of  each 
actor’s  brain.  Detailed  localizations  were  made  using  Osirix  viewer.  In  addition 
to  MRI  guidance,  we  confirmed  that  electrodes  were  in  ACCg,  ACCs  or  OFC  by 
listening to gray matter- and white matter-associated sounds while lowering the
electrodes. ACCg neurons were recorded from Brodmann areas 24a, 24b and 32,
ACCs neurons (dorsal and ventral banks) were recorded from 24c and 24c', and
OFC neurons were recorded from 13m and 11 (based on standard anatomical
references51,52; Figs. 3a and 6).

Single-unit  recordings  were  made  from  two  actor  monkeys  while  each  was 
engaged  in  a  reward-allocation  task  with  a  recipient  monkey  in  267  sessions. 
A  total  of  81  ACCg  neurons  (MY,  45;  MO,  36),  101  ACCs  neurons  (MY,  39; 
MO,  62)  and  85  OFC  neurons  (MY,  46;  MO,  39)  were  included  in  the  study. 
Neurons  were  selected  for  recording  based  solely  on  the  quality  of  isolation.  For 
a  small  subset  of  the  data  (18%;  ACCg,  0%;  ACCs,  25%;  OFC,  27%),  data  were 
collected  in  a  task  with  a  fixed  reward  size  (typically  1.0  ml  per  successful  trial; 
identical to Fig. 1d except without the magnitude cue). For the majority of the
cells (82%, n = 219), data were either collected in a task with the magnitude cue
(ACCg, 100%, n = 81; ACCs, 60%, n = 61; OFC, 42%, n = 36; Fig. 1d) or both with
and  without  the  magnitude  cue  (that  is,  two  or  more  consecutive  blocks  per  cell; 
ACCg,  0%;  ACCs,  15%,  n  =  15;  OFC,  31%,  n  =  26).  We  combined  the  two  types 
of  data  in  our  analyses  unless  otherwise  specified. 

Data  from  each  cell  consisted  of  firing  rates  during  440  ±13  (±217)  (median 
±  s.e.m.  (±s.d.))  trials.  A  trial  was  considered  incomplete  if  the  monkey  failed  to 
choose  a  target  on  choice  trials  (choice-avoidance  error)  or  to  maintain  fixation 
after cue onset on cued trials (forced-choice avoidance error). Such trials were
not  included  in  the  neural  analysis.  The  monkeys  performed  the  task  well,  as 
evidenced  by  a  high  percentage  of  correct  trials  even  on  trials  in  which  they  did 
not  receive  juice  reinforcement  (Fig.  2a). 

Data analysis. Choice preference indices were constructed as contrast
ratios33,34.

$$\text{Preference Index} = \frac{R_A - R_B}{R_A + R_B} \qquad (1)$$

R_A and R_B were the frequencies of making particular choices. For self:other trials,
R_A and R_B were the numbers of choices to reward other and self, respectively.
For other:neither trials, R_A and R_B were the numbers of choices to reward other and
neither, respectively. Finally, for self:neither trials, R_A and R_B were the numbers of
choices to reward neither and self, respectively. Indices therefore ranged from
-1 to 1, with 1 corresponding to always choosing to allocate reward to other
on other:neither trials and self:other trials, and always choosing not to reward
self on self:neither trials. An index of -1 corresponds to the opposite, generally
stated as choosing not to allocate reward to the other monkey or choosing
to reward oneself. Values of 0 indicate indifference. For constructing neuronal
preferences, we simply substituted the choice frequency with neuronal firing
rates associated with making specific decisions. Response times, the time from
the onset of choices to movement onset, were computed using a 20° s^-1 velocity
threshold criterion33,34.
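As an illustration only, the contrast ratio of equation (1) can be sketched in Python. The function name and the guard against an empty trial set are assumptions added for this example; the same function applies to the neuronal version of the index if firing rates are passed in place of choice counts.

```python
def preference_index(r_a, r_b):
    """Contrast ratio (R_A - R_B) / (R_A + R_B), bounded in [-1, 1].

    r_a, r_b: non-negative choice counts (or firing rates) for the two
        options, ordered as described in the text (for example, other
        and neither on other:neither trials).
    """
    total = r_a + r_b
    if total == 0:
        # Assumption: an index is undefined when no choices were made.
        raise ValueError("preference index is undefined when R_A + R_B is 0")
    return (r_a - r_b) / total
```

For example, 30 choices to reward other and 10 choices to reward neither give an index of 0.5, equal counts give 0 (indifference), and never choosing option A gives -1.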

Spike  rates  were  computed  during  the  reward  epoch  (50-600  ms  from  reward 
onset) as well as the choice/cue epoch (-100 to 300 ms from making a choice or
cue offset). For the population analyses, we normalized reward firing rates to
the average baseline rates for each reward outcome (300-ms interval before
fixation onset). Using marginally different time windows and different normalization
methods all resulted in similar conclusions. Coefficients of variation were
calculated for each neuron on the basis of the s.d. (σ) and mean (μ) using the
spike rates (spikes per s) from the reward epoch:

$$\mathrm{CV} = \frac{\sigma}{\mu} \qquad (2)$$
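A minimal sketch of equation (2) follows. The text does not state whether the sample or population s.d. was used; the sample s.d. is assumed here, and the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(spike_rates):
    """Return CV = sigma / mu for one neuron's reward-epoch spike rates.

    spike_rates: sequence of per-condition or per-trial rates (spikes per s).
    Assumption: sigma is the sample standard deviation (n - 1 denominator).
    """
    mu = statistics.fmean(spike_rates)
    if mu == 0:
        # CV is undefined for a neuron with zero mean rate.
        raise ValueError("coefficient of variation is undefined for mean 0")
    sigma = statistics.stdev(spike_rates)
    return sigma / mu
```

Because the CV is scaled by the mean, it allows variability across reward outcomes to be compared between neurons with different overall firing rates.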

In OFC and ACCs populations, the two self rewards (that is, self rewards chosen
from self:neither and self:other trials) were largely indifferent (Figs. 4 and 5b,c),
and we combined them by taking means for the coefficient of variation analysis.
In contrast, the population of ACCg neurons responded more strongly to self
rewards obtained from a social context (self:other) compared with when there
was no reward stake for the other monkey (self:neither); thus, we considered the
two self rewards separately in ACCg (see Figs. 3 and 5a).

ANOVA was performed separately for each cell to classify the reward response
selectivity of individual neurons from each area. Two-factor ANOVA
was used to classify the selectivity of reward outcome (self, other or neither) and
trial type (choice or cued) for all neurons. Three-factor ANOVA was used to classify
the selectivity of reward volume (binned into small, medium, large) for the
82% of cells from all areas that were collected in the task with a magnitude cue.
Statistical  significance  for  each  reward  type  was  computed  by  Tukey  HSD  test. 
Finally,  we  excluded  three  OFC  cells  when  our  analyses  involved  using  the  data 
from  neither  rewards  because  these  cells  were  recorded  on  very  rare  sessions  in 
which  the  monkeys  either  never  chose  the  neither  reward  option  or  did  so  fewer 
than  four  times.  Across  all  analyses,  using  slightly  different  epoch  durations  for 
neuronal  data  analyses  led  to  similar  results. 

Classification  of  cell  types  by  significant  reward  specificity.  Based  on  Tukey 
HSD  tests  from  the  one-way  ANOVA  on  reward  outcome  (self,  other,  or  neither) 
for  both  the  choice/cue  epoch  and  reward  epoch  responses,  we  classified  cells 
into  the  following  categories:  self-referenced,  other-referenced,  both-referenced 
and unclassified. These categories do not imply functional roles, but indicate
that firing rates were significantly different based on reward outcomes. We refer
to a neuron as self-referenced if the responses of the neuron were significantly
different (P < 0.05) between self and other rewards as well as between self and
neither rewards, but not different between other and neither rewards. We refer
to a neuron as other-referenced if the responses of the neuron showed significant
differences in firing rates between self and other rewards as well as between other
and neither rewards, but not different between self and neither rewards. Finally,
we refer to a neuron as both-referenced if the responses of the neuron showed
significant differences in responses between self and neither rewards as well as
other and neither rewards, but not different between self and other rewards.
Neurons that did not fall into one of these categories were considered as unclassified.
Applying slightly different criteria or differently configured ANOVA did
not change the overall proportional trends of these classes.
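The classification rules above amount to a lookup on the three pairwise comparisons. The sketch below assumes that the post hoc test has already produced one P value per pair; the function and argument names are hypothetical, and the 0.05 threshold is the one stated in the text.

```python
def classify_cell(p_self_other, p_self_neither, p_other_neither, alpha=0.05):
    """Assign a reward-specificity label from three pairwise P values.

    Each argument is the post hoc P value for the named pair of reward
    outcomes; a pair is treated as different when P < alpha.
    """
    self_other = p_self_other < alpha
    self_neither = p_self_neither < alpha
    other_neither = p_other_neither < alpha

    if self_other and self_neither and not other_neither:
        return "self-referenced"
    if self_other and other_neither and not self_neither:
        return "other-referenced"
    if self_neither and other_neither and not self_other:
        return "both-referenced"
    # Includes cells with no significant pair, one significant pair,
    # or all three pairs significant.
    return "unclassified"
```

Note that, under these rules, a neuron whose responses differ for all three pairs falls into the unclassified group, as does a neuron with only one significant pair.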

Reward magnitude analysis. We examined reward magnitude modulation in 219
neurons (that is, the 82% of all neurons that were collected with the magnitude cue; 81 ACCg,
76  ACCs  and  62  OFC  neurons).  We  performed  a  linear  regression  on  the  activity 
(spikes  per  s)  of  individual  neurons  across  unbinned  reward  sizes.  We  fit  the 
data  using  the  reward  epoch  activity  separately  for  self,  other  and  neither  reward 
outcomes  and  obtained  fitted  slopes  (that  is,  reward  magnitude  sensitivity  in 
spikes  per  s  per  ml)  for  each  reward  outcome.  For  examining  the  relationship 
between  the  reward  magnitude  sensitivity  across  actors’  received  and  foregone 
reward  outcomes,  we  compared  the  average  signed  slopes  from  all  received 
rewards  (self  rewards  on  choice  and  cued  trials)  and  all  foregone  rewards  (other 
and  neither  reward  on  choice  and  cued  trials)  in  individual  neurons. 
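The slope estimate described above can be sketched as an ordinary least-squares fit of firing rate against reward volume, computed separately for each reward outcome. This is an illustrative implementation under that assumption; the function names and the averaging helper are hypothetical.

```python
def magnitude_slope(volumes_ml, rates):
    """Least-squares slope of firing rate on reward volume.

    volumes_ml: reward sizes on individual trials (ml, unbinned).
    rates: reward-epoch firing rates on the same trials (spikes per s).
    Returns the slope in spikes per s per ml.
    """
    n = len(volumes_ml)
    mean_x = sum(volumes_ml) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in volumes_ml)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(volumes_ml, rates))
    return sxy / sxx


def mean_signed_slope(slopes):
    """Average signed slope across conditions (for example, other and neither)."""
    return sum(slopes) / len(slopes)
```

For a given neuron, the received-reward sensitivity would be the mean of the self slopes and the foregone-reward sensitivity the mean of the other and neither slopes, with signs retained.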

51.  Vogt,  B.A.  &  Pandya,  D.N.  Cingulate  cortex  of  the  rhesus  monkey.  II.  Cortical 
afferents.  J.  Comp.  Neurol.  262,  271-289  (1987). 

52. Carmichael, S.T. & Price, J.L. Limbic connections of the orbital and medial
prefrontal cortex in macaque monkeys. J. Comp. Neurol. 363, 615-641 (1995).

People attend not only to their own experiences, but also to the
experiences of those around them. Such social awareness profoundly
influences human behavior by enabling observational learning, as
well as by motivating cooperation, charity, empathy, and spite.
Oxytocin (OT), a neurosecretory hormone synthesized by hypothalamic
neurons in the mammalian brain, can enhance affiliation or
boost exclusion in different species in distinct contexts, belying any
simple mechanistic neural model. Here we show that inhaled OT
penetrates the CNS and subsequently enhances the sensitivity of
rhesus macaques to rewards occurring to others as well as themselves.
Roughly 2 h after inhaling OT, monkeys increased the
frequency of prosocial choices associated with reward to another
monkey when the alternative was to reward no one. OT also
increased attention to the recipient monkey as well as the time it
took to render such a decision. In contrast, within the first 2 h
following inhalation, OT increased selfish choices associated with
delivery of reward to self over a reward to the other monkey,
without affecting attention or decision latency. Despite the differences
in species-typical social behavior, exogenous, inhaled OT
causally promotes social donation behavior in rhesus monkeys, as
it does in more egalitarian and monogamous ones, like prairie voles
and humans, when there is no perceived cost to self. These findings
potentially implicate shared neural mechanisms.

social  decision-making  |  neuropeptide  |  other-regarding  preference  | 
social  gaze 

Oxytocin (OT) (1) is a mammalian neurosecretory hormone,
synthesized by hypothalamic neurons, which regulates the
hypothalamic-pituitary-adrenal axis (2). The most well-understood
role of OT in mammals is in female reproduction, with
peripheral OT influencing parturition and lactation (3), and
central OT affecting mother-offspring bonding and recognition
(4, 5). More recently, OT has been found to influence nonparental
social behavior in a species-specific manner. For example,
OT promotes pair-bonding between males and females in
monogamous prairie voles (Microtus ochrogaster) (6, 7) but can
also increase aggression (i.e., mate-guarding behavior) and decrease
social interaction among females after brief exposure to
a male (8). In humans, OT also influences more complex forms
of social behavior and cognition (9-14). For example, inhaled
OT enhances trusting behavior toward other individuals in economic
games, potentially by suppressing aversion to betrayal risk
(15), and promotes cooperation within groups (16). However,
inhaled OT also provokes cultural and racial biases (17). OT
inhalation also enhances sensitivity to the experiences of others
by promoting vicarious reward and empathic pain (10, 18, 19).
Recently, OT-mediated processes have been implicated in disorders
attended by dysfunctional social behavior, including autism,
fragile X syndrome, and schizophrenia (19-22). Notably,
OT treatment improves social skills in individuals with autism
(21, 23, 24), a spectrum of disorders with marked deficits in
sensitivity to what happens to others, including impairments
in understanding and responding to social cues (22, 25, 26).
Variations in a common oxytocin-receptor allele are linked to
autism spectrum disorders and are associated with reduced volume
in hypothalamus and anterior cingulate cortex (27).

Despite a growing literature, the mechanisms mediating the
influence of OT on sensitivity to what happens to others remain
only partially understood (9, 14, 19, 21, 28, 29). OT receptors are
localized in multiple regions of the brain, with especially high
density in areas implicated in affective and social processing. In
prairie voles, OT receptors are densely localized in the amygdala,
prelimbic cortex (homologous to the cingulate cortex in primates),
and nucleus accumbens of the striatum (30). Recently, it
has been shown that OT selectively inhibits a dedicated channel
from the central nucleus of the amygdala to periaqueductal gray,
ultimately reducing fear-induced freezing behavior in rats (31).
Similarly, in humans, inhaled OT influences on social behavior
are associated with reduced blood oxygen level-dependent (BOLD)
signals in the bilateral amygdala and dorsal striatum (28, 29),
consistent with the OT-mediated negative affect processing in
the amygdala-cingulate circuits (22). These studies provide evidence
that OT influences information processing in neural circuits
implicated in emotion and social behavior.

Unlike prairie voles or humans (2, 6, 9-11, 13-16, 30, 32, 33),
rhesus macaques (Macaca mulatta) live in large, hierarchical
social groups with promiscuous mating and uniparental female
care of offspring. Precisely how OT might influence social cognition
in animals with this type of social structure and mating
system, if at all, remains unknown. To answer this question, we
capitalized on a recent finding by our group showing that rhesus
macaques are sensitive to the rewards experienced by others, and
this vicarious reinforcement is sufficient to motivate them to
work to reward another monkey when the alternative is delivering
reward to no one (34). We found that inhaling OT increased
OT levels in cerebrospinal fluid (CSF), demonstrating
transnasal penetration into the CNS. Roughly 2 h after OT inhalation
and onward, donor monkeys selectively increased the
frequency of choosing an option resulting in reward to an adjacent,
visible monkey, when the alternative was rewarding no one.
In the same context, OT also increased the frequency that
donors looked at the recipient monkey and prolonged choice
response times. In contrast, up to about 2 h postinhalation, OT
increased selfish decisions when the donors had the option to
reward self over the other monkey. These findings invite the
hypothesis that OT boosts internal vicarious reinforcement signals
in a context-dependent manner in neural circuits homologous
to those mediating these processes in humans. Our results
demonstrate that OT mediates other-regarding behavior in nonhuman
animals, even in those living in despotic societies with
uniparental care.

Results 

Donor monkeys (hereafter, "self" or "donor") performed a reward
allocation task with an unrelated recipient monkey
("other") (Fig. 1 A-C) (34). The two monkeys were seated in
adjacent primate chairs (Crist), 100 cm apart and at 45° angles to
each other. Each monkey viewed his own LCD display, and had
a juice tube positioned in front of his mouth through which reward
could be delivered. On each trial, donors chose between
two visual shapes, associated with rewarding self, other, or neither.
We have previously shown that donors typically prefer the
shape delivering reward to other over neither (34). This preference
is enhanced by greater familiarity between the two monkeys,
and is abolished if the recipient monkey is replaced with
a juice collection bottle, thus demonstrating the fundamentally
social nature of the task (34).

For each session, we intranasally (35) delivered 25 international
units (IU) of OT or saline, on alternating days, to two males using
a pediatric nebulizer 30 min before performing the reward allocation
task. One session, composed of multiple reward allocation trials
after either OT or saline administration, occurred on each day
(Methods). Data from a total of 12 OT and 10 saline control sessions
were collected from two donors (MY and MO) while they
engaged in the reward allocation task (Fig. 1 A-C) with an unrelated
recipient monkey (MD). Five OT and three saline sessions
were collected from MY, and seven OT and seven saline sessions
were collected from MO. For statistical power, we present data
collapsed across the two donors, unless otherwise stated.

OT inhalation, compared with saline, significantly increased
OT concentration in CSF as measured by cervical draws (P <
0.05, Welch two-sample t test) (Fig. 1D), confirming transnasal
penetration into the CNS. Thirty minutes after OT administration,
donors began the reward allocation task. For choices between
delivering reward to other and neither, OT selectively
amplified reward donations to other (Fig. 2). Preference for
other increased linearly over time after OT but not after saline
(OT: slope different from 0, r² = 0.26, P < 0.0005; saline: r² = 0.01,
P = 0.47, linear regression) (Fig. 2). OT-induced enhancement
of prosocial choices was largest in the later half of a given session
(i.e., ~110 min after OT administration and ~80 min after task
initiation; preference index mean difference between OT vs.
saline: 0.17, P < 0.00001, Welch two-sample t test) (Fig. 2). Individual
donors showed a similar pattern (MY: 0.18, P < 0.00001;
MO: 0.19, P < 0.01). We found a significant difference between
the two treatment conditions even when we averaged across the
entire duration of the task (mean difference of 0.12, P < 0.00001;
MY: 0.15, P < 0.00001; MO: 0.06, P < 0.05, Welch two-sample
t test).

Fig. 1. Reward allocation task. (A) Experimental setup. (B) Trial sequence. Choice (Upper) and cued (Lower) trials were randomly interleaved. The eye-gaze
cartoons specify the task intervals during which the donors could potentially look at the recipient monkey. MT, movement time; RT, reaction time. (C) Stimuli
associated with different reward outcomes to donors and recipient, shown separately for the two donors. (D) OT concentration in the CSF after intranasal OT
(in red) or saline (dark gray). *P < 0.05, Welch two-sample t test. Colored outlines on the datapoints represent animal identities.

Fig. 2. Intranasal OT promotes both vicarious and self reinforcement.
Choice preference index (moving averages of 200 trials per session, 50-trial
step) for OT (red) and saline (gray) across all reward options (other vs. neither,
self vs. other, and self vs. neither). Datapoints from self vs. other and
self vs. neither are jittered along the ordinate for visibility. (Inset) Unjittered
and magnified data from self vs. other trials. Data from self vs. neither trials
were effectively overlapping between the OT and saline conditions, and
therefore not shown in an unjittered format. OT, 12 sessions; saline, 10
sessions. Lines show linear regression on other vs. neither trials.

In contrast, in the early half of a given session (i.e., up to ~80
min into the task), OT slightly but significantly increased selfish
choices on self vs. other trials compared with saline control
(mean  difference  between  OT  and  saline  of  -0.02,  P  <  0.00001, 
Welch  two-sample  t  test;  Inset  in  Fig.  2  shows  unjittered  self  vs. 
other  trials),  but  had  no  effect  on  self  vs.  neither  trials  (mean 
difference  of  -0.002,  P  =  0.36).  Individual  donors  showed 
a  similar  selfish  bias  (MY:  -0.003,  P  <  0.06;  MO:  -0.04,  P  < 
0.00001).  The  absence  of  OT  effect  on  self  vs.  neither  trials  might 
be  due  to  the  fact  that  this  context  does  not  involve  a  potential 
reward  to  another  monkey,  although  we  cannot  rule  out  the 
possibility  that  donors  were  maximally  self-regarding  in  this 
context  in  the  absence  of  OT.  Thus,  OT  robustly  enhanced 
prosocial  choices  when  there  was  no  potential  cost  to  self,  but 
slightly  increased  selfish  choices  when  there  was  potential  for 
direct  self  reward. 
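The time course of the preference index plotted in Fig. 2 (moving averages of 200 trials with a 50-trial step) can be sketched as follows. The sketch assumes that the trials of one context (for example, other vs. neither) have already been extracted in session order and coded as 1 when option A was chosen and 0 when option B was chosen, and that only full-length windows are kept; these conventions and the function name are assumptions, not details from the original analysis.

```python
def sliding_preference_index(choices, window=200, step=50):
    """Preference index in successive trial windows.

    choices: sequence of 1 (option A chosen) or 0 (option B chosen),
        in the order the trials occurred.
    Returns one (R_A - R_B) / (R_A + R_B) value per full window.
    """
    indices = []
    # Only windows that fit entirely within the session are evaluated.
    for start in range(0, len(choices) - window + 1, step):
        chunk = choices[start:start + window]
        n_a = sum(chunk)
        n_b = len(chunk) - n_a
        indices.append((n_a - n_b) / len(chunk))
    return indices
```

A regression line through these windowed values against window time would then correspond to the linear trends shown for other vs. neither trials.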

Fig. 3. Intranasal OT enhances attention to the recipient monkey and
increases the deliberation time for making donation decisions. (A) Gaze to
the face of the other monkey after reward delivery. (Left) Percentages of
gaze shifts to the recipient monkey on choice trials (Upper) and cued trials
(Lower). (Right) Number of gaze shifts over the course of each day session
for other vs. neither choice trials (moving averages of 200 trials per session,
50-trial step). Lines through the datapoints show linear regressions. (B) Response
times, measured as saccade onset times following target onset (ms).
(C) OT reduced choice avoidance [i.e., declining to choose by breaking fixation
upon target onset (such as reward options), which in the task resulted
in a time out for 5 s]. *P < 0.05, Welch two-sample t test.

Donor monkeys often shift gaze to the recipient monkey after
making a choice, and this attention to the recipient is enhanced
after prosocial choices compared with selfish choices (34). OT
further enhanced this overt other-oriented attention to the recipient
after donors made a decision on other vs. neither trials
(Fig. 3A) (OT vs. saline: mean difference of 4.70%, P < 0.05,
Welch two-sample t test). In contrast, we did not observe any
effects of OT on donor's attention to the recipient when direct
self reward was involved (self vs. neither: mean difference of
-0.36%, P = 0.95; self vs. other: 0.03%, P = 0.99) (Fig. 3A). We
also found that donors looked more frequently to the recipient
when rewards were delivered to him compared with when
rewards were delivered to self, even on cued trials in which
rewards were delivered by computer without any action by
donors (gaze frequency on self-cued vs. other-cued trials: OT,
P < 0.005; saline: P = 0.05) (Fig. 3A). However, OT did not
modulate this difference in social attention on cued trials (all
comparisons P > 0.23, Welch two-sample t test) (Fig. 3A), suggesting
that OT enhances other-oriented attention selectively
following prosocial decisions rather than in response to anything
happening to the other monkey (i.e., after active choices on
other vs. neither trials). As in the other-oriented choice preference,
attention to the recipient monkey also increased linearly
over time after OT (slope significantly different from 0: r² =
0.31, P < 0.00001, linear regression) (Fig. 3A, Right). The frequency
of looking at the recipient monkey in the saline control
also increased over the course of the session (r² = 0.19, P <
0.005), but with a significantly lower rate of rise than the OT
condition (differences in OT and saline slopes greater than zero:
P < 0.005, permutation test) (Fig. 3A). This finding suggests that
OT enhances the intensity of vicarious reinforcement in part by
modulating attentional mechanisms.
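The text does not describe how the permutation test on the slope difference was constructed. One common construction, sketched here as an assumption rather than as the authors' procedure, shuffles the condition labels of the pooled (time, value) datapoints and asks how often the shuffled slope difference is at least as large as the observed one. All names, the number of permutations, and the fixed seed are hypothetical.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_difference_permutation_test(ot, saline, n_perm=1000, seed=0):
    """One-sided label-shuffling test for slope(ot) > slope(saline).

    ot, saline: lists of (time, value) pairs, one list per condition.
    Returns (observed slope difference, permutation P value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def slope_diff(group_a, group_b):
        return ols_slope(*zip(*group_a)) - ols_slope(*zip(*group_b))

    observed = slope_diff(ot, saline)
    pooled = list(ot) + list(saline)
    n_ot = len(ot)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign condition labels at random
        if slope_diff(pooled[:n_ot], pooled[n_ot:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated P value above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Other constructions (for example, shuffling whole sessions rather than individual datapoints) are equally plausible and would respect within-session dependence better.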

We also examined the time required by monkeys to render
a decision. Response times in the reward allocation task are
generally slower when donor monkeys choose between delivering
reward to other vs. neither, compared with when self reward is
involved (34). OT selectively prolonged response times on other
vs. neither trials (mean difference between OT and saline of 26.0
ms, P < 0.00001, Welch two-sample t test) (Fig. 3B), possibly
reflecting internal processes, such as deliberation and control.
On self vs. neither and self vs. other trials, however, OT only
showed a trend on response times (self vs. other: mean difference
of 14.78 ms; self vs. neither: 8.72 ms; both P < 0.13) (Fig.
3B). Finally, on some trials, donors avoided making a decision,
opting to wait until the next trial (although they could not predict
the subsequent reward options). OT reduced this choice avoidance
behavior across all trial types (all P < 0.05, Welch two-sample
t test) (Fig. 3C), perhaps because of overall enhancement
in subjective reinforcement.

Inhaled OT thus influenced reward donation decisions by
rhesus macaques when there was an option to reward another
monkey (other vs. neither and self vs. other, but not self vs. neither).
OT enhanced reward donations on other vs. neither trials,
but increased selfish behavior on self vs. other trials (Fig. 2).
OT-induced changes in attention to the recipient monkey (Fig.
3A) and decision time (Fig. 3B) were both specific to the donation
context (other vs. neither), whereas OT-induced reductions in
choice avoidance behavior (Fig. 3C) were global.

Discussion 

Compared with some other nonhuman primates, social behavior
of rhesus monkeys is primarily characterized by competition and
aggression, and shows very weak, if any, inclination toward cooperation
(36, 37). In a prior study, different levels of endogenous
OT were reported in more socially affiliative mother-reared
compared with more socially agnostic nursery-reared macaques
(38). Here we show that exogenous OT promotes social donation
behavior in rhesus macaques, as it does in more egalitarian and
monogamous species, like prairie voles and humans. OT-induced
prosocial donations were accompanied by enhanced other-oriented
attention and decision times. In contrast, in a context in
which there was a potential for rewarding self or another monkey,
OT slightly increased the tendency for donors to choose
selfishly without influencing overt attention and, at most, minimally
affecting decision times. The absence of OT-induced enhancement
of overt attention on these trials suggests that OT
modulates other-oriented preferences through vicarious reinforcement
(34). These findings are consistent with context-dependent
effects of OT on human social behavior (16, 17, 39)
(for a review of human social processing, see ref. 40), implying
similar neural mechanisms.

Given  the  context-specific  increase  in  attention  to  the  other 
monkey  and  more  deliberative  decision  latency,  it  is  conceivable 
that  these  behaviors  are  related.  Several  hypotheses  are  plausible. 
On  the  one  hand,  OT  may  increase  attention  to  the  other  monkey 
via neural circuits mediating orienting behavior, including amygdala,
parietal cortex, and superior colliculus. Increased attention to
the  recipient  may  enhance  vicarious  reinforcement  experienced 
from  delivering  juice  to  him.  Alternatively,  OT  may  influence 
neural  circuits  involved  in  decision-making,  including  the  striatum 
and  anterior  cingulate  cortex  (see  introductory  paragraphs). 
Slowed  response  times  may  reflect  more  deliberate  processing  of 
the  potential  outcomes  available  (41).  A  future  study  designed  to 
probe  the  temporal  evolution  of  OT-induced  effects  on  attention 
and  decision-making  will  be  needed  to  resolve  these  hypotheses. 

The  direction  of  OT-induced  social  enhancement  also  appears  to 
vary  as  a  function  of  time.  OT  initially  enhanced  self  reinforcement 
but  later  amplified  vicarious  reinforcement,  although  the  largest 
OT-induced effects were prosocial. Although this interaction between
time-dependent and context-dependent effects of OT may
be specific to our reward allocation task and thus can only be extrapolated
with caution, these results suggest that OT may influence
self- and other-regarding behaviors via distinct underlying
neural  mechanisms. 

Why  might  OT  promote  self  reinforcement  bias  on  self  vs.  other 
but  not  on  self  vs.  neither  trials?  The  key  difference  between  the 
two contexts is the alternative option. In one context, the alternative
option has a social consequence (i.e., rewarding the recipient),
whereas in the other context, the alternative option does
not (i.e., nothing happens to either donor or recipient). OT-induced
self reinforcement may depend on the contrast between
rewarding  self  and  another  individual.  We  hypothesize  that  when 
a  decision  context  presents  this  contrast,  OT  can  promote  selfish 
behavior.  OT  influences  on  self  and  vicarious  reinforcement  (16, 
17,  39)  thus  appear  to  depend  on  the  social  state  of  the  underlying 
neural  circuits. 

Previous  studies  in  monogamous  prairie  voles  and  promiscuous 
montane voles (Microtus montanus) have suggested that mating
system may be a key predictor of OT influences on social behavior
through the topology of OT receptor localization in neural circuits
mediating reinforcement and motivation (33). A more
general  difference  between  prairie  voles  and  montane  voles  is  the 
frequency  and  intensity  of  social  interaction  (33).  Compared  with 
montane  voles,  prairie  voles  are  biparental,  show  more  selective 
aggression,  and  spend  more  time  in  close  physical  proximity  (33). 
Humans  and  rhesus  macaques,  too,  are  highly  social  mammals; 
intranasal  OT  induces  prosocial  tendencies  in  humans  (15,  16) 
and,  as  we  now  report,  in  rhesus  macaques.  These  findings  suggest 
that  OT  may  play  a  critical  role  in  modulating  social  behavior  in 
highly gregarious mammals, regardless of mating system or parental
care strategy.

Intranasal  administration  of  OT  in  humans  has  also  been 
shown  to  increase  gaze  to  the  eyes  of  others  (19).  We  found  that 
OT  enhanced  gaze  directed  at  the  face  of  the  other  monkey 
following  active  social  decision-making  but  not  following  passive 
reward  delivery.  This  finding  invites  the  possibility  that  OT  gates 
the  activity  of  attention  circuits  in  the  brain  specifically  during 
active  interaction  with  others.  Evidence  from  human  functional 
neuroimaging  studies  is  consistent  with  this  idea.  For  example, 
OT  selectively  modulates  BOLD  signal  in  the  anterior  cingulate 
cortex,  amygdala,  midbrain,  and  dorsal  striatum  during  a  trust 
game  involving  other  human  players,  but  not  during  a  nonsocial 
decision-making  task  (29).  Functional  connectivity  between  the 
amygdala  and  midbrain  structures  is  also  reduced  by  OT  when 
human  participants  view  emotional  faces  (28).  Finally,  OT 
reduces  the  subjective  evaluation  of  aversively  conditioned  faces, 
and  this  reduction  is  accompanied  by  suppressed  BOLD 
responses  in  the  amygdala  and  the  fusiform  gyrus  (42). 

Consistent  with  our  results,  OT  modulates  deliberation  times 
during  social  decision-making  in  humans.  For  example,  OT  slows 
overall  evaluation  time  for  rating  faces  in  a  nonspecific  manner, 
regardless  of  whether  the  images  were  aversively  conditioned  or 
not  (42).  OT  can  also  speed  up  decision  times;  for  example,  OT 
decreased overall key press reaction times for evaluating in-group
favoritism and out-group derogation in an implicit association
test (17).

OT  enhanced  the  frequency  of  prosocial  decisions  in  the  absence 
of  opportunity  for  direct  self  reward,  but  provoked  an  increase  in 
selfish  decisions  when  choosing  between  self  and  other.  Such  a  dual 
function  has  also  been  reported  in  humans.  OT  can  both  promote 
cooperation  and  increase  out-group  bias  depending  on  behavioral 
context  (16, 17,  39).  Thus,  OT  does  not  appear  to  have  a  universal 
prosocial  influence  on  behavior,  but  rather  amplifies  ongoing  social 
information  processing  (21),  perhaps  by  influencing  already  existing 
preferences.  It  is  plausible  that  OT  mediates  prosociality  and 
generosity  only  in  an  indirect  manner.  Alternatively,  OT  may  play 
a  more  direct  and  causal  role  in  modulating  context-dependent 
social  information  processing  (e.g.,  refs.  27-29  for  neural  evidence), 
specifically  by  enhancing  the  gain  of  neural  circuits  mediating 
vicarious  reinforcement  and  attention. 

Recently,  OT  has  been  evaluated  for  potential  therapeutic  use  in 
clinical  conditions  attended  by  dysfunctional  social  behavior,  such 
as  autism  spectrum  disorders,  antisocial  personality  disorder,  and 
schizophrenia  (20-24,  43,  44).  Notably,  the  intranasal  nebulization 
method (35) we developed here is well-tolerated by children for
delivery of other therapeutics (e.g., albuterol), thus opening up
avenues for early OT intervention in neuropsychiatric conditions
with social deficits. Furthermore, the choice-specific effect of OT on
increasing  other-oriented  attention  suggests  a  potential  need  for 
active  decision-making  during  OT  interventions. 

The current finding opens up new opportunities for uncovering
the mechanisms underlying the influences of OT on social
behavior  in  a  species  much  more  closely  related  to  humans  than 
rodents.  Rhesus  monkeys  have  long  served  as  the  primary  model 
species  for  probing  the  neural  mechanisms  mediating  high-level 
cognition.  Given  the  strong  similarities  in  social  behavior  and 
cognition,  and  the  apparent  homologies  in  underlying  neural 
circuitry,  the  rhesus  macaque  provides  a  powerful  model  for 
probing  the  mechanisms  mediating  some  of  the  basic  behaviors 
that  make  complex  human  social  interactions  possible. 

Methods 

General  Procedures  and  Behavioral  Task.  All  procedures  were  approved  by  the 
Duke  University  Institutional  Animal  Care  and  Use  Committee.  Two  donor 
monkeys  (MY  and  MO)  and  a  recipient  monkey  (MD)  participated  in  the 
study.  All  animals  underwent  standard  surgical  procedures  for  implanting 
a  head-restraint  prosthesis  at  least  6  mo  before  the  present  study.  The  head- 
restraint  prosthesis  allowed  us  to  monitor  eye  position,  sampled  at  1,000  Hz 
(SR  Research;  Eyelink),  as  well  as  conduct  single-unit  recordings  in  other 
experiments,  not  reported  here.  Both  the  donor  and  recipient  were  head- 
restrained  throughout  the  experiment.  Donors  and  recipient  were  unrelated, 
middle-ranked, and not cage mates. The location of the recipient's face
(other; corresponding horizontal and vertical eye positions) was empirically
mapped. Rewards were 0.5-1.0 mL of cherry-flavored juice. Within each block,
reward size was constant for all three outcomes. A separate solenoid, which
produced clicks but delivered no fluid, was designated for the "neither"
outcome. To prevent monkeys from forming secondary associations between
solenoid clicks and different reward types, all solenoid valves (including
the one used for the "neither" outcome) were placed in another room.
Masking white noise was also played in the experimental room.

Donors began the trial by shifting gaze (± 2.5°) to a central stimulus (0.5° ×
0.5°) and maintaining fixation (for 200 ms). Choice and cued trials were
presented at equal frequencies and randomly interleaved. On choice trials (Fig.
1B), two visual targets (4° × 4°) appeared at two random locations of 7°
eccentricity, reflected about the vertical meridian. Donors shifted their
gaze to one target (± 2.5°) to indicate their choice. On cued trials (Fig. 1B),
donors maintained fixation while a cue appeared centrally (for 500 ms). On
both trial types, the reward onset was preceded by a 0 to 0.9 s delay. Donors
could freely look around for 0-0.9 s after making a choice and for another
1 s after the reward onset. Data from error trials are not included
in analyses.

Data from 12 OT (MY: 5, MO: 7) and 10 saline (MY: 3, MO: 7) sessions were
collected on alternating days. Each daily session comprised, on average, 1,274 ±
141 (mean ± SEM) trials. Within each daily session, the donors completed several
blocks of the task (a median of 6 and 6.5 blocks for OT and saline, respectively).
Each of these blocks typically consisted of 192 ± 10 (mean ± SEM)
and 205 ± 15 trials for OT and saline, respectively.

Intranasal  OT  Protocol.  Donor  monkeys  were  transported  in  the  primate  chair 
from  the  colony  room  to  the  experimental  room.  After  stabilizing  their  heads, 
OT (25 IU/mL; Agrilabs) was delivered via nebulization (Pari Baby Nebulizer)
into the nose and mouth continuously for 5 min (5 IU/min) when the donor
monkeys  were  fully  awake.  On  alternating  days,  nebulized  saline  served  as 
a  control.  Before  experimental  sessions,  donor  monkeys  were  first  habituated 
to  the  nebulizer  and  then  accustomed  to  saline  delivery  using  the  nebulizer  in 
an incremental fashion until they were completely relaxed during the procedure,
which typically took about a week. In fact, donor monkeys showed no
distress during this procedure. Testing began exactly 30 min after each treatment,
at which time a recipient monkey was brought to the experimental
setup.  In  the  guinea  pig  CNS,  radioactively  labeled  OT  lasts  up  to  4  h  (45).  In 
humans,  intranasal  delivery  of  a  similar  peptide,  vasopressin  (differing  by  only 
two amino acids), increases its concentration in the CSF after 10 min, and elevated
vasopressin levels are maintained for more than 80 min after administration
(35). In that study (35), vasopressin levels increased significantly after
30 min. Previous studies in humans have not measured inhaled OT uptake into
the CNS. Fig. 1D plots CSF OT levels in monkeys 35 min after inhalation,
demonstrating  efficacy  of  the  intranasal  nebulization  method  (see  below). 
Note  that  the  mask  was  always  pressed  very  tightly  to  minimize  potential 
leakage,  but  nonetheless  leakage  could  have  occurred.  It  is  worth  noting  that 
CSF OT levels may have continued to increase after the time of CSF measurement,
warranting caution in linking absolute CSF OT levels with changes
in behavior. Despite these uncertainties, our nebulization technique resulted
in a ~2.5-fold increase in CSF OT levels roughly 0.5 h after inhalation.
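
As a worked check of the stated dosing, and assuming that nebulizer output was constant and fully delivered (an idealization, given the possible mask leakage noted above), the nominal total dose and nebulized volume follow from the stated rate, duration, and concentration:

\[
5\ \text{IU/min} \times 5\ \text{min} = 25\ \text{IU}, \qquad \frac{25\ \text{IU}}{25\ \text{IU/mL}} = 1\ \text{mL}.
\]

These are nominal upper bounds on what each donor received, not measured quantities.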

CSF  OT  Protocol.  To  determine  whether  inhaled  OT  penetrates  the  CNS  after 
nebulization,  OT  concentration  in  CSF  was  measured  via  cervical  punctures 
(on  average  35  min  after  the  beginning  of  inhalation).  Cervical  punctures 
were  performed  by  a  licensed  veterinarian,  and  targeted  the  cisterna  magna 
through the juncture between the occipital base and atlas (C1) through the
atlanto-occipital  membrane.  Monkeys  were  first  anesthetized  with  ketamine 
(3  mg/kg,  i.m.)  and  dexdomitor  (0.075  mg/kg,  i.m.).  To  reverse  anesthesia,  we 
administered  antisedan  (0.075  mg/kg,  i.m.)  once  the  animal  was  returned  to 
its  cage  after  the  draw.  Approximately  0.5  mL  of  CSF  was  drawn  using  a  24  to 
27  gauge  needle.  At  the  performing  veterinarian's  discretion,  bupivacaine 
was administered subcutaneously at the insertion site following needle removal.
CSF was immediately frozen on dry ice and sent off-site to be assayed
for  OT  (Biomarkers  Core  Labs,  Yerkes  National  Primate  Research  Center, 
Atlanta,  GA)  using  a  commercially  prepared  kit  [Assay  Designs  (now  Enzo 
Life  Sciences);  cat.  #  900-153:  Oxytocin  ELISA  kit,  with  very  low  reactivity 
with  vasopressin].  Samples  were  assayed  "neat"  with  a  range  of  15.6-1,000 
pL  assay  volume.  This  assay  has  near-zero  reactivity  with  vasopressin,  which  is 
chemically  similar  to  OT,  thus  providing  specific  quantitation  of  OT. 

Data  Analysis.  Preference  index  was  a  contrast  ratio  of  frequency  of  choosing 
an  option,  nA  or  nB: 

\[
\text{Preference Index} = \frac{n_A - n_B}{n_A + n_B}.
\]

For  choices  between  self  vs.  other,  nA  and  nB  were  number  of  choices  to 
reward  other  and  self,  respectively.  For  choices  between  other  vs.  neither,  nA 
and  nB  were  number  of  choices  to  reward  other  and  neither,  respectively. 
Finally,  for  choices  between  self  vs.  neither,  nA  and  nB  were  number  of 
choices  to  reward  neither  and  self,  respectively.  Indices  ranged  from  -1  to  1, 
with  1  corresponding  to  always  choosing  the  "prosocial"  option  to  reward 
the  recipient  monkey  (when  that  was  an  option)  or  to  withhold  reward  from 
self  (self  vs.  neither).  An  index  of  -1  indicated  that  donors  always  chose  an 
"antisocial"  option  to  reward  self  (when  that  was  an  option)  or  to  withhold 
reward from the other monkey (other vs. neither). Preference index of 0 indicated
indifference. Frequency of donors looking at recipients was computed
from number of gaze shifts to the recipient's facial region (within ±
8.5°  spanning  from  the  center  of  the  recipient's  face).  Reaction  times  (time 
from  target  onset  to  movement  onset)  were  computed  using  a  20°/s  velocity 
threshold  (46). 
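
The preference index and the velocity-threshold reaction time described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names, the error raised when no choices were made, and the sample-to-sample speed estimate are all hypothetical choices. The 1,000 Hz sampling rate and 20°/s threshold are taken from the text.

```python
import math
from typing import Optional, Sequence


def preference_index(n_a: int, n_b: int) -> float:
    """Contrast ratio (nA - nB) / (nA + nB); ranges from -1 to 1."""
    total = n_a + n_b
    if total == 0:
        # Assumption: an empty context is treated as an error, not as 0.
        raise ValueError("no choices recorded for this context")
    return (n_a - n_b) / total


def reaction_time_ms(
    x_deg: Sequence[float],
    y_deg: Sequence[float],
    sample_rate_hz: float = 1000.0,
    threshold_deg_per_s: float = 20.0,
) -> Optional[float]:
    """Time (ms) from target onset (sample 0) to movement onset.

    Movement onset is taken as the first sample whose speed, estimated
    from the displacement since the previous sample, exceeds the
    threshold. Returns None if the threshold is never crossed.
    """
    n_samples = min(len(x_deg), len(y_deg))
    for i in range(1, n_samples):
        step_deg = math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1])
        if step_deg * sample_rate_hz > threshold_deg_per_s:
            return i * 1000.0 / sample_rate_hz
    return None
```

For example, on other vs. neither trials with 30 choices to reward other and 10 to reward neither, `preference_index(30, 10)` gives 0.5. A real pipeline would likely smooth the eye trace before thresholding; the raw difference is used here only to keep the sketch short.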